{
 "cells": [
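  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "# House Price Prediction Case Study (Advanced)\n",
    "\n",
    "This is the advanced version of the notebook. Its main goal is to compare several modelling approaches, so the feature-engineering part at the start is left unchanged; the focus is the model-building section at the end.\n",
    "\n",
    "## Step 1: Inspect the source data"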
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd"
   ]
  },
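  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Read in the data\n",
    "\n",
    "* The index column of the source data usually carries no information of its own, so we can use it as the index of our pandas DataFrame. That also makes lookups easier later.\n",
    "\n",
    "* Every community has its pecking order, and Kaggle, like Zhihu, is no exception. By default Kaggle puts the data in the *input* folder, so when we write a tutorial we may as well follow that convention and look like we know what we are doing."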
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "train_df = pd.read_csv('../input/train.csv', index_col=0)\n",
    "test_df = pd.read_csv('../input/test.csv', index_col=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Inspect the source data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>MSSubClass</th>\n",
       "      <th>MSZoning</th>\n",
       "      <th>LotFrontage</th>\n",
       "      <th>LotArea</th>\n",
       "      <th>Street</th>\n",
       "      <th>Alley</th>\n",
       "      <th>LotShape</th>\n",
       "      <th>LandContour</th>\n",
       "      <th>Utilities</th>\n",
       "      <th>LotConfig</th>\n",
       "      <th>...</th>\n",
       "      <th>PoolArea</th>\n",
       "      <th>PoolQC</th>\n",
       "      <th>Fence</th>\n",
       "      <th>MiscFeature</th>\n",
       "      <th>MiscVal</th>\n",
       "      <th>MoSold</th>\n",
       "      <th>YrSold</th>\n",
       "      <th>SaleType</th>\n",
       "      <th>SaleCondition</th>\n",
       "      <th>SalePrice</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Id</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>60</td>\n",
       "      <td>RL</td>\n",
       "      <td>65.0</td>\n",
       "      <td>8450</td>\n",
       "      <td>Pave</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Reg</td>\n",
       "      <td>Lvl</td>\n",
       "      <td>AllPub</td>\n",
       "      <td>Inside</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>2008</td>\n",
       "      <td>WD</td>\n",
       "      <td>Normal</td>\n",
       "      <td>208500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>20</td>\n",
       "      <td>RL</td>\n",
       "      <td>80.0</td>\n",
       "      <td>9600</td>\n",
       "      <td>Pave</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Reg</td>\n",
       "      <td>Lvl</td>\n",
       "      <td>AllPub</td>\n",
       "      <td>FR2</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>2007</td>\n",
       "      <td>WD</td>\n",
       "      <td>Normal</td>\n",
       "      <td>181500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>60</td>\n",
       "      <td>RL</td>\n",
       "      <td>68.0</td>\n",
       "      <td>11250</td>\n",
       "      <td>Pave</td>\n",
       "      <td>NaN</td>\n",
       "      <td>IR1</td>\n",
       "      <td>Lvl</td>\n",
       "      <td>AllPub</td>\n",
       "      <td>Inside</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "      <td>9</td>\n",
       "      <td>2008</td>\n",
       "      <td>WD</td>\n",
       "      <td>Normal</td>\n",
       "      <td>223500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>70</td>\n",
       "      <td>RL</td>\n",
       "      <td>60.0</td>\n",
       "      <td>9550</td>\n",
       "      <td>Pave</td>\n",
       "      <td>NaN</td>\n",
       "      <td>IR1</td>\n",
       "      <td>Lvl</td>\n",
       "      <td>AllPub</td>\n",
       "      <td>Corner</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>2006</td>\n",
       "      <td>WD</td>\n",
       "      <td>Abnorml</td>\n",
       "      <td>140000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>60</td>\n",
       "      <td>RL</td>\n",
       "      <td>84.0</td>\n",
       "      <td>14260</td>\n",
       "      <td>Pave</td>\n",
       "      <td>NaN</td>\n",
       "      <td>IR1</td>\n",
       "      <td>Lvl</td>\n",
       "      <td>AllPub</td>\n",
       "      <td>FR2</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "      <td>12</td>\n",
       "      <td>2008</td>\n",
       "      <td>WD</td>\n",
       "      <td>Normal</td>\n",
       "      <td>250000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 80 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    MSSubClass MSZoning  LotFrontage  LotArea Street Alley LotShape  \\\n",
       "Id                                                                    \n",
       "1           60       RL         65.0     8450   Pave   NaN      Reg   \n",
       "2           20       RL         80.0     9600   Pave   NaN      Reg   \n",
       "3           60       RL         68.0    11250   Pave   NaN      IR1   \n",
       "4           70       RL         60.0     9550   Pave   NaN      IR1   \n",
       "5           60       RL         84.0    14260   Pave   NaN      IR1   \n",
       "\n",
       "   LandContour Utilities LotConfig    ...     PoolArea PoolQC Fence  \\\n",
       "Id                                    ...                             \n",
       "1          Lvl    AllPub    Inside    ...            0    NaN   NaN   \n",
       "2          Lvl    AllPub       FR2    ...            0    NaN   NaN   \n",
       "3          Lvl    AllPub    Inside    ...            0    NaN   NaN   \n",
       "4          Lvl    AllPub    Corner    ...            0    NaN   NaN   \n",
       "5          Lvl    AllPub       FR2    ...            0    NaN   NaN   \n",
       "\n",
       "   MiscFeature MiscVal MoSold  YrSold  SaleType  SaleCondition  SalePrice  \n",
       "Id                                                                         \n",
       "1          NaN       0      2    2008        WD         Normal     208500  \n",
       "2          NaN       0      5    2007        WD         Normal     181500  \n",
       "3          NaN       0      9    2008        WD         Normal     223500  \n",
       "4          NaN       0      2    2006        WD        Abnorml     140000  \n",
       "5          NaN       0     12    2008        WD         Normal     250000  \n",
       "\n",
       "[5 rows x 80 columns]"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At this point we should have a rough idea of which columns need manual handling to make the source data easier to process."
   ]
  },
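  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 2: Combine the data\n",
    "\n",
    "We do this mainly because preprocessing is more convenient on a single DataFrame. Once all the required preprocessing is done, we split the two sets apart again.\n",
    "\n",
    "First, *SalePrice* is our training target, so it appears only in the training set and not in the test set (otherwise, what would there be to test?). We therefore pull the *SalePrice* column out first to get it out of the way.\n",
    "\n",
    "Let's first see what *SalePrice* looks like:"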
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[<matplotlib.axes._subplots.AxesSubplot object at 0x10c05b5f8>,\n",
       "        <matplotlib.axes._subplots.AxesSubplot object at 0x10c095860>]], dtype=object)"
      ]
     },
     "execution_count": 74,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYYAAAEKCAYAAAAW8vJGAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAIABJREFUeJztnX24HVV97z/fEIhEMCe8nYMJcBQVsFUjKkmtVw6vErwF2iuCVuUgvVKRKnqrRGtLaa0C97EGahXaIjf6VEHwBVQkaMmxtiqCZCtKCAFySELI4TUgbxHI7/6x1k7mbPY+e/Y5M7Nnzv59nmc/e2bNmu+sWbP2/GbWd81smRmO4ziOU2dGtwvgOI7jlAsPDI7jOM44PDA4juM44/DA4DiO44zDA4PjOI4zDg8MjuM4zjh6NjBIWivp8Iy0jpb0zSms/05J12VRlryR9A1JR3e7HM70QdKbJK3qdjmc7ahXn2OQtBY4zcxuyEDrJuAMM7tp6iXrLpIGgEuA1wN7A4Nmti6x/A3AF83s9V0qouM4OdOzdwxZIen1wIsmGxQk7ZBxkVptZ62kfVNk3Qp8H/gT4HlXDXE/d5V0cMZFdHqQotq/0xkeGABJO0laKuleSRskfU7SjonlH5O0MS47TdJWSS+NixcDP2rQ2yrpLyTdJel+SRcklp0i6b8k/aOkh4BzYtqPE3l+T9L1kh6SdJ+kJTFdkpZIulPSA5Iul9SXcjdT3Rqa2f1mdjFwM6AW2X4EvDXldp0eJF6ILJH0m9iOL42/s0MlrY+/qfuAL9XTEuvOj12W98d2flFi2Xsl3RY1v5/yYsfpEA8MgU8ChwCvBl4Tpz8JIOkY4CzgcOBlwKGMP8m+CljdRPME4OD4OV7SexPLFgJ3AnsC/xDTLG5vF+AHwLWErpyXAf8R83wIOA74H8CLgUeAL6Tcx1Yn+cmwilBPjjMR7wSOAvYHDiD+poABoA/YF3hfTKu3/xnAd4G1cfk84PK47ARgCeG3tSfwY+BrBexHz+GBIfBO4Fwze8jMHgLOBd4dl50IXGZmt5vZ03FZ8iTbB/y2ieZ5ZvaomW0AlgLvSCy718y+YGZbzWxLw3r/E7jPzJaa2e/M7IlEN9X7gL8ys/vM7Bng74C3xR9TGrIKDr8l7LfjTMQ/mdlGM9tMuACq/waeA84xs2eatP+FhAuij5nZ0/E38JO47H3AZ8zsDjPbCpwHLJC0TwH70lN4YAi8GFiXmL8nptWXrU8sS05DuGrftYnmhhZ6zTSS7APc1WLZfsC3JD0s6WHgNuAZoL8xo6R9JD0S8z5CuPr6ZSLt5AnK0I5dgc1TWN/pDVr9Bh6IFzbNmA/cE0/8jewHXJho/w8R7jTmZVVgJzCz2wUoCRsJja4+ZG6/mAZwH6Gx1tmX8V1JvwJe0URzn4Tevgk9mLi/fz3j7y6SrAPea2Y/nWD9sAGz9cDc+ryku4FDY/pUOQj4ZQY6zvQmeSWf/E21a//7SprRJDisAz5lZt59lDN+xxD4GvBJSXtI2gP4a+ArcdnXgVMlHShpdlyW5FpgqInmRyX1xdvcDxH7SVPwXaBf0gejWbeLpEPiskuAT9cNN0l7Sjoupa5I2ZUkaRbwgjj7gjif5FDCyCXHmYgPSJonaTfg42z/DUzUDn9OuBg7T9JsSbMkvTEuuwT4hKRXAkiaI+lteRW+l+nlwJC8avkUYRTOrwhXwjcTTWEzuw64CFgB3AHU+zu3xOUrgc1xfH+Sq4FfALcA3wG+lKpQZo8TDLvjgE1xm0Nx8YVR93pJj8ayHNJEpql0ynwATwGPxXVuB56sL4j7+biZ3dyBntObfBW4njDQ4k4aBlo0I94l/BHwcsIdwnrg7XHZtwm+wuWSNhN+r8fkVfheJtUDbpJGgUcJY9yfMbNDJM0FriDcIo4CbzezR2P+iwjDOJ8Ahs2slkvpu4CkA4FbgVn1W11JRwHvN7M/ifNbgZeZ2d3dK2k+SLoK+FczW97tsuSJpA8DpxHa/K3AqYQ+8ssJXXS3AO82s2cl7QR8GXgd8CBwUvKh
wF4kywdIneJJe8ewFRgys9eaWf0KdQnwQzM7ALiBcKuIpMXA/mb2cuB04OKMy1w4kk6QtGMMhucD1yT7P83sB/WgMN0xs7f1QFB4MfAXwMFm9mqCF/cOwrH/bGzzmwmBg/j9cGzzS4ELnq/qONWhk2GOjXmPB5bF6WVxvp7+ZQAzuxGYI+l5o2YqxunAA8AawiigM9rk7833jEwvdgBeKGkmsDPBOD0M+EZcvowwnh7G/xauAo4osJxlxX8DFSbtqCQDlksy4BIz+zeg38zGAMxsk6S9Yt55jB+OeW9MG8uozIVjZos7zO+P+VcYM9so6bOEPu4nCf3ktwCbE3eKG9g+THJbmzez5yRtlrSbmT1ccNFLg5m9tH0up6ykDQxvjCf/PQnG52paXxE0G3HgVw9OZYivGTme4J89ClxJ8MwaqbfrxjYvvM07FSZVYDCzTfH7AUnfJoyEGZPUb2ZjCm/kvD9m38D48cvzGT+GH4B49+E4U8bMsnzdB8CRwN31K35J3wLeCPQlxtcn23W9zW9UeCnci8zskUZRb/NOVuTQ5sfR1mOIY4l3idMvBI4mjNK4BhiO2YYJwyiJ6e+J+RcRbr+bdiOZ2ZQ/55xzTql0ylim6apjltt5dh2wSNILJIngGfyGMGT5xJjnlIY2f0qcPpEwGKMp3sZ6UydLrSJIc8fQT3gNg8X8/25m10u6Gfh6fDncOuIPxsyulXSspDsJw1VPzansAIyOjpZKJ0st1+kOZvbzOCx3JWGwwUrgXwgPM14u6e9j2qVxlUuBr0haQ3hNw1ReN5KKsh0L1ylWK2/aBgYzWwssaJL+MOGWu9k6Z069aI7TPczsXMILE5OsJbzkrTHvFuJDWI4zHaj8k8/Dw8Ol0slSy3WcVpTtWLhOsVp507W/9pRk3dq2M32QhOVsxGWFt3knC4po85W/YxgZGSmVTpZaruO0omzHwnWK1cqbygcGx3EcJ1u8K8mpNN6V5PQa3pXkOI7jFE7lA8N07kt0HacVZTsWrlOsVt5UPjA4juM42eIeg1Np3GNweg33GKYhAwODSEr1GRgY7HZxHcfpQSofGKrWlzg2dg/hjcztPyFvvuWpsk4vU7Zj4TrFauVN5QOD4ziOky3uMRRMeItz2v1WYa/ZrSruMTi9hnsMjuM4TuFUPjB4X2J7ylZHZaufKlK2Y+E6xWrlTeUDg+M4jpMt7jEUjHsM2eIeg9NruMfgOE5LOnkmxp+NcTqh8oHB+xLbU7Y6Klv9VJGRkZGOnolp9WxM2Y7pdNXJWitvKh8YHCdrJL1C0kpJt8TvRyV9UNJcSddLWi1puaQ5iXUukrRGUk3S8/4j3XGqhHsMBeMeQ7bk3d8qaQawAVgInAk8ZGYXSDobmGtmSyQtBs40s7dKWghcaGaLmmhl2uY7a0vb1vI2VXHcY3Cc7nMkcJeZrQeOB5bF9GVxnvj9ZQAzuxGYI6m/6II6TlZUPjB4X2J7ylZHZaufNpwEfDVO95vZGICZbQL2iunzgPWJde6NablRtmPhOsVq5U3lA4Pj5IWkHYHjgCtjUqs+mGa39d5f41QW9xgKxj2GbMmzv1XSccAZZnZMnF8FDJnZmKQBYIWZHSTp4jh9Rcx3O3Bo/e4ioWennHIKg4ODAPT19bFgwQKGhoaA7VeUaedDW1oBDMUtjMTvieYP29amOt2ez3dnvj49OjoKwLJly3L3GDwwFIwHhmzJOTB8DbjOzJbF+fOBh83sfElLgL5oPh8LfCCaz4uApW4+O3nh5nMKvC+xPWWro7LVTzMk7Uwwnr+ZSD4fOErSauAI4DwAM7sWWCvpTuAS4Iy8y1e2Y+E6xWrlzcxuF8BxyoiZPQXs2ZD2MCFYNMt/ZhHlcpwi8K6kgvGupGzp5XcleVdSb+JdSY7jOE7hVD4weF9ie8pWR2WrnypStmPhOsVq5U3lA4PjOI6TLe4xFIx7DNniHoN7DL2GewyO4zhO4VQ+MHhfYnvKVkdlq58q
UrZj4TrFauVN5QOD4ziOky2pPYb4XvqbgQ1mdpykQeByYC5wC/BuM3tW0k6EVxC/DngQOMnM1jXRc4+hfW7vD26DewzuMfQaZfMYPgTclpg/H/ismR0AbAZOi+mnEd4n83JgKXBBFgV1HMdxiiFVYJA0HzgW+LdE8uHAN+L0MuCEOJ38M5OrCO+UyQ3vS2xP2eqobPVTRcp2LFynWK28SXvH8Dngo8T7Vkm7A4+Y2da4fAPb/5hk25+WmNlzwGZJu2VWYsdxHCdX2noMkt4KLDazMyUNAR8B3gv8NHYX1e8ovmdmr5H0a+BoM9sYl90JvMHMHmnQzfTd9FWZP+ywwwjxdSTWxFD8bjZ/NPAMaZg7t59vfvPyru/fdHw3fVa4x+BkQREeQ5rA8GngXcCzwM7ArsC3CWetATPbGt9Bf46ZLZZ0XZy+UdIOwH1mtlcTXTef2+fuKG+v1qcHho7W6sl2Mp0ohflsZp8ws33N7KXAycANZvYuwl9HnRiznQJcHaevifPE5TdkW+TxeF9ie8pWR2WrnypStmPhOsVq5c1UnmNYAnxE0h3AbsClMf1SYA9Ja4CzYj7HcRynIvi7kgrGu5KyxbuSvCup1yhFV5LjOI7TW1Q+MHhfYnvKVkdlq58qUrZj4TrFauVN5QOD4+SBpDmSrpS0StJvJC2UNFfS9ZJWS1ouaU4i/0WS1kiqSVrQzbI7zlRxj6Fg3GPIlrz6WyX9P+BHZnaZpJnAC4FPAA+Z2QWSzgbmmtkSSYuBM83srZIWAhea2aImmu4xOFOmFM8x5LZhDwxpcneUt1frM+sfiaRdgZqZ7d+QfjtwqJmNSRoAVpjZQZIujtNXxHyrgCEzG2tY3wODM2XcfE6B9yW2p2x1VLb6acJLgQclXSbpFkn/Imk20F8/2ZvZJqD+4Oa218BE7mX7K2JyoWzHwnWK1cqbmd0ugOOUkJnAwcAHzOxmSZ8jPI/T6lK72dVb07zDw8OZvQYmMMLEr1VpNs82vVqt1vXXnCTnp3N5arXapNavT9dfA1ME3pVUMN6VlC05dSX1E94F9tI4/yZCYNif2EXUpitpW5dTg653JTlTxruSHKcLxBP6ekmviElHAL8hvO5lOKYNM/41MO8BiO8N29wYFBynSlQ+MHhfYnvKVkdlq58WfBD4d0k14DXApwl/TnWUpNWEYHEegJldC6yNbxK+BDgj78KV7Vi4TrFaeeMeg+M0wcx+CbyhyaIjW+Q/M98SOU5xuMdQMO4xZIu/K8k9hl7DPQbHcRyncCofGLwvsT1lq6Oy1U8VKduxcJ1itfKm8oHBcRzHyRb3GArGPYZscY/BPYZewz0Gx3Ecp3AqHxi8L7E9ZaujstVPFSnbsXCdYrXypvKBwXEcx8kW9xgKxj2GbHGPwT2GXsM9BsdxHKdwKh8YvC+xPWWro7LVTxUp27FwnWK18qbygcFxHMfJFvcYCsY9hmxxj8E9hl7DPQbHcRyncCofGLwvsT1lq6Oy1U8VKduxcJ1itfKm8oHBcRzHyRb3GArGPYZscY/BPYZewz0Gx3Ecp3AqHxi8L7E9ZaujstVPFSnbsXCdYrXypvKBwXHyQNKopF9KWinp5zFtrqTrJa2WtFzSnET+iyStkVSTtKB7JXecqeMeQ8G4x5AtefW3SrobeJ2ZPZJIOx94yMwukHQ2MNfMlkhaDJxpZm+VtBC40MwWNdF0j8GZMu4xOE73EM//fRwPLIvTy+J8Pf3LAGZ2IzBHUn8RhXScPKh8YPC+xPaUrY7KVj8tMGC5pJsk/VlM6zezMQAz2wTsFdPnAesT694b03KjbMfCdYrVypuZ3S6A45SUN5rZJkl7AtdLWk3rfptmt/VN8w4PDzM4OAhAX18fCxYsYGhoCNh+4kg7HxgBhhLTpJhnm16tVpv09vOYn87lqdVqk1q/Pj06OkpRtPUYJM0C/hPYiRBIrjKzcyUNApcDc4FbgHeb2bOSdiLcVr8OeBA4yczWNdF1
j6F97o7y9mp95t3fKukc4HHgz4AhMxuTNACsMLODJF0cp6+I+W8HDq3fXSR03GNwpkwpPAYz2wIcZmavBRYAi6PBdj7wWTM7ANgMnBZXOQ142MxeDiwFLsil5I6TE5JmS9olTr8QOBq4FbgGGI7ZhoGr4/Q1wHti/kXA5sag4DhVIpXHYGZPxslZhLsGAw4DvhHTlwEnxOmkQXcVcEQmJW2B9yW2p2x1VLb6aUI/8F+SVgI/A75jZtcTLoaOit1KRwDnAZjZtcBaSXcClwBn5F3Ash0L1ylWK29SeQySZgC/APYH/hm4i3BVtDVm2cB2s22bEWdmz0naLGk3M3s405I7Tk6Y2VrC3XFj+sPAkS3WOTPvcjlOUXT0HIOkFwHfAs4BvmRmr4jp84HvmdlrJP0aONrMNsZldwJvSI4Hj+nuMbTP3VHeXq1Pf1dSR2v1ZDuZThTR5jsalWRmj0n6EbAI6JM0I941zAc2xmwbgH2AjZJ2AF7UGBTqZDlCoyrz26nPD7WZT5s/bKPb+1dE/RU9QsNxeg4zm/AD7AHMidM7E0YoHQtcQRhxBPBF4M/j9BnAF+L0ycDlLXQtC1asWFEqnXZagIGl/HSWN+99K5uO2bb9btuOy/DJqs2bhTrsrH00bydlO6bTVSdLrSLafJo7hr2BZdFnmAFcYWbXSloFXC7p74GVwKUx/6XAVyStAR6KwcFxHMepCP6upIJxjyFb3GNwj6HXKMVzDI7jOE5vUfnA4OOV21O2Oipb/VSRsh0L1ylWK28qHxgcx3GcbHGPoWDcY8gW9xjcY+g13GNwHMdxCqfygcH7EttTtjoqW/1UkbIdC9cpVitvKh8YHMdxnGxxj6Fg3GPIFvcY3GPoNdxjcBzHcQqn8oHB+xLbU7Y6Klv9VJGyHQvXKVYrbyofGBzHcZxscY+hYNxjyBb3GNxj6DXcY3Acx3EKp/KBwfsS21O2Oipb/VSRsh0L1ylWK28qHxgcJy8kzZB0i6Rr4vygpJ9JWi3pa5JmxvSdJF0uaY2kn0rat7sld5yp4R5DwbjHkC159rdK+jDwOsLf0x4n6QrgKjO7UtIXgZqZXSLp/cCrzOwMSScBf2xmz/uDKvcYnCxwj8FxuoSk+YS/sP23RPLhwDfi9DLghDh9fJwHuAo4oogyOk5eVD4weF9ie8pWR2WrnxZ8Dvgo8ZJc0u7AI2a2NS7fAMyL0/OA9QBm9hywWdJueRaubMfCdYrVyps0//nsOD2FpLcCY2ZWkzRUT46fJJZYNk6CFn08w8PDDA4OAtDX18eCBQsYGgqbqJ840s4HRoChxDQp5tmmV6vVJr39POanc3lqtdqk1q9Pj46OUhTuMRSMewzZkkd/q6RPA+8CngV2BnYFvg0cDQyY2VZJi4BzzGyxpOvi9I2SdgDuM7O9mui6x+BMGfcYHKcLmNknzGxfM3spcDJwg5m9C1gBnBiznQJcHaevifPE5TcUWV7HyZrKBwbvS2xP2eqobPXTAUuAj0i6A9gNuDSmXwrsIWkNcFbMlytlOxauU6xW3rjH4DgTYGY/An4Up9cCC5vk2QK8veCiOU5uuMdQMO4xZIu/K8k9hl7DPQbHcRyncCofGLwvsT1lq6Oy1U8VKduxcJ1itfLGPQbH6SlmxS6o9PT378emTaP5FMcpJe4xFIx7DNniHkPnHoP7EtXGPQbHcRyncCofGLwvsT1lq6Oy1U8Vya4Os9EpW9som07WWnlT+cDgOI7jZIt7DAXjHkO2uMfgHkOv4R6D0wFhtEmaz8DAYLcL6zhOial8YPC+xDpbCFeCzT4rxs2Pjd0zqS2Usa57FfcYqqWTtVbeVD4wOI7jONniHkPB5Okx9KIf4R6Dewy9Rik8BknzJd0g6TZJt0r6YEyfK+l6SaslLZc0J7HORZLWSKpJWpDnDjiO4zjZkqYr6VngI2b2SuAPgA9IOpDwzvkfmtkBhD8m+TiApMXA/mb2cuB04OJcSh7xvsQ0jGSj
UsK67lXcY6iWTtZaedM2MJjZJjOrxenHgVXAfOB4YFnMtizOE7+/HPPfCMyR1J9xuR3HcZyc6MhjkDRIuMT4fWC9mc1NLHvIzHaX9B3gM2b2k5j+Q+BjZnZLg5Z7DO1z55Z3utS9ewzuMfQapfAYEoXZBbgK+FC8c2jVUpoV2FuV4zhORUj12m1JMwlB4StmVv8D9DFJ/WY2JmkAuD+mbwD2Saw+H9jYTHd4eJjBwUEA+vr6WLBgAUNDQ8D2/rh28/W0tPlbzS9dunRS228231i2xvIG6vNDbebT5q+nNVue1Hp+edPO12o1zjrrrEmvX5+fqH7SHO+RkRFGR0fpZbL1GIamrjIysu1YuU7+WrljZm0/BM/gHxvSzgfOjtNLgPPi9LHA9+L0IuBnLTQtC1asWFEqnXZagIGl/GSVd8Xz8ma9X93QMdu2L6nacdoPMAu4EVgJ3AqcE9MHgZ8Bq4GvATNj+k7A5cAa4KfAvi10M9vvFStWdNg+WrWTxrbRfJ005clqv6ajTpZaebT5xk9bj0HSHwL/GX8g9cdnPwH8HPg64e5gHXCimW2O63weOAZ4AjjVGvyFmMfabXs64h5DtuTV3ypptpk9KWkH4L+BDwEfAa4ysyslfRGomdklkt4PvMrMzpB0EvDHZnZyE81M27x7DL1JER6DP+BWMB4YsiXvH4mk2YQLozOA7wIDZrZV0iLCncRiSdfF6RtjINlkZns20fLA4EyZUpnPZcXHK6dhJBuVEtZ1XkiaIWklsAn4AXAXsNnMtsYsG4B5cXoesB7AzJ4DNkvaLc/y+XMM1dLJWitv/D+fHacJMQC8VtKLgG8BBzXLFr8br95aXpZnMeBivIE5QvsBDI3zyXVrqfNPdYBH2gEOeep3szy1Wm1S69enixxw4V1JBeNdSdlSSH+r9DfAk8DHSNeVdJ+Z7dVEx7uSnCnjXUkVYGBgMPX/IIQfslN2JO1Rf/eXpJ2BI4HbCO8vPzFmOwWoD92+Js4Tl99QXGkdJ3sqHxi63ZcY/tvAGj4rmqTVP91gJBuVEvbb5sTewApJNcKw1eVmdi1hWPZHJN0B7AZcGvNfCuwhaQ1wVsyXK+4xVEsna628cY/BcRows1uBg5ukrwUWNknfAry9gKI5TiG4xzBFOu/ndY8hS/xdSe4x9BruMTiO4ziFU/nAUMa+xKz6bbNjJBuVUtZ1b+IeQ7V0stbKm8oHBsdxHCdb3GOYIu4xdBf3GNxj6DXcY3Acx3EKp/KBoYx9ie4xFKPTy7jHUC2drLXypvKBwXEcx8kW9ximiHsM3cU9BvcYeg33GBzHcZzCqXxgKGNfonsMxej0Mu4xVEsna628qXxgcBzHcbLFPYYp4h5Dd3GPwT2GXsM9BsdxHKdwKh8YytiX6B5DMTq9jHsM1dLJWitvKh8YHMdxnGxxj2GKuMfQXdxjcI+h13CPwXEcxymcygeGMvYlusdQjE4v4x5DtXSy1sqbygcGx8kaSfMl3SDpNkm3SvpgTJ8r6XpJqyUtlzQnsc5FktZIqkla0L3SO87UcY9hirjH0F3y6G+VNAAMmFlN0i7AL4DjgVOBh8zsAklnA3PNbImkxcCZZvZWSQuBC81sURNd9xicKeMeg+N0ATPbZGa1OP04sAqYTwgOy2K2ZXGe+P3lmP9GYI6k/kIL7TgZUvnAUMa+RPcYitEpAkmDwALgZ0C/mY1BCB7AXjHbPGB9YrV7Y1puuMdQLZ2stfJmZrcL4DhlJXYjXQV8yMwel9SqP6XZbX3TvMPDwwwODgLQ19fHggULGBoaArafONLOB0aAocQ0KeaT69ZS5++0fJOZr9Vquep3szy1Wm1S69enR0dHKQr3GKaIewzdJa/+Vkkzge8C3zezC2PaKmDIzMaiD7HCzA6SdHGcviLmux04tH53kdB0j8GZMu4xOE73+BJwWz0oRK4BhuP0MHB1Iv09AJIWAZsbg4LjVInKB4Yy9iW6x1CMTl5I+kPgT4HD
Ja2UdIukY4DzgaMkrQaOAM4DMLNrgbWS7gQuAc7Iu4zuMVRLJ2utvHGPwXEaMLP/BnZosfjIFuucmV+JHKdY3GOYIu4xdBd/V5J7DL2GewyO4zhO4bQNDJIulTQm6VeJtNK8GqCMfYnuMRSj08u4x1Atnay18ibNHcNlwFsa0pYAPzSzA4AbgI8DxFcD7G9mLwdOBy7OsKyO4zhOAaTyGCTtB3zHzF4d57eN024znnvbuO8mmu4xdDHvdKh7cI/BPYbeo8wew15leTWA4ziOky1ZD1dN/WoAyOb1APW0qT6uvnTp0klvP1CfH2J8v+1Qw/Jm+SeaT5u/ntZs+fPLM9nXA5x11lmTXr8+33jsOlm/Pl3k6wHKSLYew9DUVUZGGl7V4Tp5auWOmbX9APsBv0rMryK8UAxgAFgVpy8GTkrku72er4mmZcGKFSu6qgMYWMNnRZO0+qdZ/rzzNpZncnXf7bpuRtyXVO2425+s2rxZqMPO2kerdjJRW03fXsrWNsqmk6VWEW0+rccwSPAYXhXnzwceNrPzJS0B+iy8l/5Y4AMW3ku/CFhqTd5LHzUszbbLjnsM3cU9BvcYeo1SeAySvgr8BHiFpHWSTiW8CqAUrwZwnKrz5jcfy4wZMzr67LTTzt0utjONaRsYzOydZvZiM5tlZvua2WVm9oiZHWlmB5jZUWa2OZH/TDN7mZm9xsxuybf45Ryv7M8xFKMzXbjrrrWY3YrZs6k/O+10YEZbH8lGpWRto2w6WWvljT/53JPMQlKqz8DAYLcL2yPM6PBTid4zp6L4u5KmSFU9huniR0wHj2HevIPYuPGbwEGptebMOYJHH70B9xh6j1J4DI7j9Drp7zD9TnN6UPnAUMa+RPcYitHpZZ599pGMlEZS5NlCuMuY6LNi3PzY2D2TK03J2ph7DI7jOI6DewxTxj2G7uIeQzEeg/sS5cE9BsdxHKdwKh8YytiX6B5DMTp5Uvb/ISnWYyhOp2xtzD0Gx3GSXIb/D4nTo7jHMEXcY+guefa3Zv0/JO4xOFngHoPjlAtw46IfAAAN6ElEQVT/HxKnJ6h8YChjX6J7DMXolIiO/ockC9xjqJZO1lp5k/Uf9TjOdGZMUn+iK+n+mL4B2CeRbz6wsZlAsz+n2s5I/B5KOV9PS5u/Pp9ct9Zh/vTlm+yfQU32z7bymM+yPLVabVLr16eL/HMq9ximiHsM3SVnj2GQDP+HxD0GJwuK8Bj8jsFxmhD/h2QI2F3SOuAcwv+OXCnpvcA64EQI/0Mi6dj4PyRPAKd2p9SOkw3uMWSsE9Uy1MqCkWxUSlnX+VD2/yFxj6FaOllr5U3lA0MeDAwMpn6LpOM4znTDPYYmdOYbuMfQTfxdSe4x9Br+HIPjOI5TOJUPDGXsS3SPoRidXsY9hmrpZK2VN5UPDI7jOE62uMfQBPcYxuct63EC9xjcY+g93GNwHMdxCqfygaGMfYnuMRSj08u4x1Atnay18qbygcFxHMfJFvcYmuAew/i8ZT1O4B6Dewy9h3sMjuNUlFmp3x5Q/wwMDHa70E6k8oGhjH2J7jEUo9PLlN9j2EK4y0j/GRu7p3RtzD0Gx3Ecx8E9hqa4xzA+b1mPE7jHUGaPwX2JfHCPwXEcxymcygeGMvYlTi+PoTMTMY2BWKW+1rJSfo9hkiol+z27x+A4TUmaiCtIYyA6jlNt3GNognsMk80b8hd5XN1jKK9f4B5DPrjHkCH+r2yOU3b82YeykEtgkHSMpNsl3SHp7Dy2USdtv13o4pioGyTZTTLlUmWgkSUjpdKpUl9rJxTZ7qenx7CFNN2Vabou3WOYGpkHBkkzgM8DbwF+D3iHpAOz3k6dWq2WlVJGOllrZUG56ii7Y1Yeim73zz3324yUytU2ytbGsmyrVWr3edwxHAKsMbN7zOwZ4HLg+Kw38pd/+TfMnj2Xj350CbNnz237ac/mDEuXpVYWZFWeNDrtuwM+/OEPT8dugELafR2zZzNSKrJt
FKezeXO5dLLWypuZOWjOA9Yn5jcQfjSZUqv9hqeeWgrcxrPPLmmT+/8Cn8m6CE5T6qOYJuJvgb9lbGxa+TmFtHvHKYI8AkOzX3vmQw1e8IIdmT37n/jd7zYxe/ZtE+bdsuUOtmxppziaVdEy1sqC0WmqUyom3e5nzdqRXXZ5PzNm7Jp6Y48/vjF13okZrbjOrJYDRs4999ym6TNmzGbr1idTb+Hcc8/teJ1W22pVpiT9/fuxadNox9vKksyHq0paBPytmR0T55cAZmbnN+TzcWlOJpRhuGqadu9t3smKvNt8HoFhB2A1cARwH/Bz4B1mtirTDTlOifB270wnMu9KMrPnJJ0JXE8wty/1H4cz3fF270wnuvbks+M4jlNSzCzTD3ApMAb8KpE2l3AltRpYDsxpse4pwB0x34+moPMccAuwErinic7bgF/HfAdPsC/HALfHMp3dYt/Sao0Cv4xleqCJzgXAKsJA7m8AL0pRppumoNOuPH+XWH4dMDDJY5ZWJ3nMvt2srhN5/xLYCuyWokzvybqNT/J3UT9ujwK/TduugYuANfF4LkiknwU8SRgGth74YEw/NOr/Lh7jOYlt3BPTHwPeFNNnAXcBT0etb8b0QeBXMf0x4MKYvlNsV08AT8VjXt/Gx2NZnwJ+mtBZFbUfAz6e0PlOzPsUcDUwMy57JOZ/Grg9Uf5fxPQngHMSdXFZ1Hga+H5C589jXWwhjBJbGHV+GvM+AXwhoXNJTHsa+Elivz4W056Odfoo8MFO6zou+2pi334S62Eqdb0KOLrVOSuRPgj8jNDGvlavo5btNYcfwJuABYxv+OcDH4vTZwPnNVlvLqGBzgH6gHujVkc6cdljbcpzAPBy4AZanMwJ3QF3AvsBOxJ+mO+cjFbMdzcwd4IyHQnMiNPnAZ9JUaY1wAmd6qQszy6J6b8AvjjJY9ZWp/GYtSpTTJ9PCDBraRIYmpTpLlpcQBT1aThuh8Yf5+p27RpYDHwvTi8EfpbYx9FYR33xWK4BDgQ2Av8U890OXBGnvw6sitOfB+5NaN0d62s3wsnpcOCKqHkI8EXgVsLDe+8nBPCPAScRfhfnAa8kBPX/A3yLcDJT1Lkv7vslhCB2YNRZA5wYddYAp0edLcAehJPZnVHnfODBqPNxwkXDgbGONkadhTH99LhfjwFnJNrBPlFnXdyvs2P6W6LOXXG/FsY857VoT/dFrU7r+iDgmbhvfcDjhOA12bqe2VBHzc5ZB8ZtXwGcGKe/CJw+YZvN6YewH+NPDrcD/XF6gHgV0LDOySROGrHwH+hUJy777UTlSaSvoHVgWAR8PzG/JDakjrXi8rXA7u3KFJedAHwlZZk+06nOJMqzBPjnyRyzNDrNjlmrMgFXAq+idWBoVqaT8mjnaT9Njtt5wH0TtOv6SeXiZNkJV4f9LfbxJsLJcUtC63TgkTj9CPC+xDa2NGoBs4H7gU8QTsK3Jcpfi9u5jhCU+oEdYr7b47H9B+AHwBDhBL0wbvf7CZ3V8Xd0XVw2I+o8HNOWxOnd4zrfjzqjwH8kyn9/1LmYEADqF0NrCRdpw8Cjje2AcAJendC5Ny67OE7X6+6OmLexrr8Tj8O2Ouygrt9PuMOYSzipjwKfmmRdJ+8G6nXU9JwVpx9I1NEi4LqJ2mxRL9Hby8zGAMxsE7BnkzyNDwjdS6jUTnUAZkn6uaSfSJrs06fNHliaN0ktCGPal0u6SdL/bpP3vYSDnaZMjXWURidVeSR9StI6wp3S36QoT7NjlkYHUhwzSX8ErDezW1totCrTVI5bFjSW6T7CFV2dxna9V4v16m2wMf0JYP+oO6OuBfwGeGGcfiGhy7O+DSW0NkhaCWwinPRmRM0Nie3uksjfZ2ZjZvYc4SS4V0w/AvgooW09TbibfipR1g3AzjHvvsDDZrY16jxKuAqfR7iqXi7ppsR29yBcrdfLX9cZBDab2da4jfWEu8rfBx6TdJmkW4DXx7z9hG6euk5yv3ZN1N09
hPNLY13vHet13iTqejbhpVLrCO1yM9u7tDqt62ZtvGl7kbQ7IWhtTaS/mAko09tVs3wwbl8zOwT4U2ApocF1szwAbzSz1wPHEq6qX990o9JfAc+Y2VenUqY2OqnKY2afNLN9gX8ndANNqjwpdKDhmEl6ScP+7Az8FXBOm+0X8oBlh0x2zHnjevV3WW9Ll7QL8A5C983THerXtczMXks4oc5ne2Bqlb8Zg4Qr9FpDnsb8ltBptkyE50HqbfMgQpdRs2PYTKdeRzsQTn7/bGYHE/r7D2+jk0QN30jaMZancR/TYIRgdhDhTvjFhIuDVu/TmqiuW7XxidKb1XVLigoMY5L6ASTVbwMb2UC4iqgzn3A72qlOPUpjZmsJEfr3JlHmZuWZ9OOmiTI9QPgRL2jMI+kUwo/hnR2UqbGO0uikKk+CrwH/a7LlSaHT7Ji9tiHL/oSTzy8lrY3b+oWkxpNYpsctIxrLtDfhqrhOq3a9gfEXNfV92QDsK2kmcBWhu+H6mL61rkVo90/E6ScIV9H1bVhSC8DMHiN048wlXPXWtz2f0B9ez/+IpP747MZcwjHfBThE0t2E4zyPELBmJfZ9PiF4bSRckc+VNCPqzCFc7W4AXhTL8wDhKrmf0I3yskT5n4o6a4E58SWGxDJvIHS5PGFmN8f0Bwh3HWOEdlTXSe7XY4m62y8eh+SxWww8ROhimkxd7w78zswejncA9bvDTut6wnbRmG5mDwJ9iTpq+5vIKzA0RqhrCH1+EEaMXN1kneXAUZLmSJoLHAX8uFMdSX2SdorTewBvZLs506qszbgJeJmk/aLeyXH7zaLvhFqSZscrOyS9EDia0IeZvBo5hmAyHWdmrV7g0axMP+xUJ2V5XpZY5XhCF0MjbY9ZGp0Wx+w2EnVtZr82swEze6mZvYTwI3itmTVeHDQr0/Jm9VAgjcftOEJfc51kux5me7u+BngPbHuyenPsulhO2K+vELpXBoDlMbg+BHxS4T0RHyYEDAjt5MNx+pPAg1HrJuAtsb72JpzQrgH+A5gp6RDCb21mLNc1hJP1MMHT2BDTP0AIUAcSRkxtIVyc/BBYIGk/4FRCd0pd58GocWIs99WxvO+UtJOkVxJOYt8jeEuvjjqnEU62dZ3HgRNjHe0c834d2FHSwbEdvJnQNr8FvCDu1zAhUNV1tgDDUWcW8E0S7Skei1lTqOvvAvtL2iuW6Q1xnU7r+hrg5FhHLyEEzJ/T/PxQb0s3RA1ofQ7eTg5G21cJ0WgLoS/tVEKk+yHBePoBod8M4HXAvyTWHSZE4zsIQ8o61gH+gDD0ayVhmGQznRMIVydPEfpl6+bY3sB3E+U5Jm5rDcHIabZvbbWAlxBuP1cSRhzUmuisIVxF3RI/X0hRpknppCzPVYllVwN7T/KYtdVpcsyGm9V1Qzu7m2g+tylTmYarriYYpY+Rol3H9T5PuLD5JYnBDQTTst6XPxqP9THAYYQT5e+ifv03shuhnf6OEJTeHNNfFdffEj9XJtrIrTHtt2wffTOLMJz4SUKb/3FiGx+PZb2H7cNVX0K4eq9v9xMJne/F8j9FMHV3jPnrx/1pwoOC9fKvjDpPErqb6nWxjO3DbZcDO8b0v2b7MNOb2T7y6saY90nGG8v/GtOeJgztrO/XcNyvZ4mmckzvqK7jsm/F9C2x7ur7PNm6bjZcdds5K5H+krjfdxBGKO04UXv1B9wcx3GccZTJfHYcx3FKgAcGx3EcZxweGBzHcZxxeGBwHMdxxuGBwXEcxxmHBwbHcRxnHB4YHMdxnHF4YHAcx3HG8f8B3rxJc/rZu/EAAAAASUVORK5CYII=",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x10c05b400>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%matplotlib inline\n",
    "prices = pd.DataFrame({\"price\":train_df[\"SalePrice\"], \"log(price + 1)\":np.log1p(train_df[\"SalePrice\"])})\n",
    "prices.hist()"
   ]
  },
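  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As the histograms show, the label itself is not smooth: the raw price distribution is skewed. To help the regression model learn more accurately, we first \"smooth\" the label, that is, bring it closer to a normal distribution.\n",
    "\n",
    "Many people miss this step, and their results then never get past a certain level.\n",
    "\n",
    "Here we use log1p, i.e. log(x + 1), which is defined at x = 0 and avoids negative outputs for non-negative x.\n",
    "\n",
    "Remember: if we transform the label here, the model predicts on the transformed scale, so the predictions must be converted back when computing the final result.\n",
    "\n",
    "Following the \"undo what you did\" principle, log1p() needs expm1(); likewise, log() needs exp(), and so on."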
   ]
  },
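  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The round trip described above can be sketched as follows. This is a minimal illustration on made-up numbers, not part of the original pipeline; the array `prices_demo` is a hypothetical stand-in for *SalePrice*.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Hypothetical prices, used only for illustration\n",
    "prices_demo = np.array([0.0, 140000.0, 208500.0])\n",
    "\n",
    "logged = np.log1p(prices_demo)   # forward transform: log(x + 1)\n",
    "restored = np.expm1(logged)      # inverse transform: exp(y) - 1\n",
    "\n",
    "# log1p is defined at 0, and expm1 recovers the original values\n",
    "assert np.allclose(restored, prices_demo)\n",
    "```\n",
    "\n",
    "The same pairing applies to model output: predictions made on the log1p scale would be passed through `np.expm1` before being reported as prices."
   ]
  },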
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "y_train = np.log1p(train_df.pop('SalePrice'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then we concatenate what remains of the training set with the test set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "all_df = pd.concat((train_df, test_df), axis=0)"
   ]
  },
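  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One possible way to split a combined frame back apart later is to select rows by the original indices. The sketch below assumes the two sets have non-overlapping index values; `train_demo`, `test_demo` and `all_demo` are hypothetical miniature frames, not the real data.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical miniature train/test frames with disjoint Id indices\n",
    "train_demo = pd.DataFrame({'LotArea': [8450, 9600]}, index=[1, 2])\n",
    "test_demo = pd.DataFrame({'LotArea': [11622]}, index=[1461])\n",
    "\n",
    "# Stack the rows, as done for all_df\n",
    "all_demo = pd.concat((train_demo, test_demo), axis=0)\n",
    "\n",
    "# ... shared preprocessing would happen here ...\n",
    "\n",
    "# Recover each part by label-based selection on the original indices\n",
    "train_back = all_demo.loc[train_demo.index]\n",
    "test_back = all_demo.loc[test_demo.index]\n",
    "\n",
    "assert train_back.equals(train_demo)\n",
    "assert test_back.equals(test_demo)\n",
    "```\n",
    "\n",
    "If the indices could overlap, label-based selection would be ambiguous, and slicing by position would be the safer choice."
   ]
  },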
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now all_df is the combined DataFrame:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2919, 79)"
      ]
     },
     "execution_count": 77,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_df.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*y_train*, in turn, is the log-transformed *SalePrice* column:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Id\n",
       "1    12.247699\n",
       "2    12.109016\n",
       "3    12.317171\n",
       "4    11.849405\n",
       "5    12.429220\n",
       "Name: SalePrice, dtype: float64"
      ]
     },
     "execution_count": 78,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y_train.head()"
   ]
  },
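  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 3: Transform the variables\n",
    "\n",
    "This is akin to feature engineering: we bring data that is awkward to handle or inconsistent into a uniform form.\n",
    "\n",
    "#### Correct the variable types\n",
    "\n",
    "First, we notice that the values of *MSSubClass* are really category codes,\n",
    "\n",
    "but pandas has no way of knowing that. In a DataFrame, numeric codes like these are treated as numbers by default.\n",
    "\n",
    "That is misleading, so we need to convert the column to *string*."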
   ]
  },
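  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A small sketch of why the cast matters, using a hypothetical three-row frame (`demo` is made up and not part of the dataset): numeric summaries treat the codes as quantities, while the string version is handled as labels.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical mini-frame; the codes mimic MSSubClass values\n",
    "demo = pd.DataFrame({'MSSubClass': [60, 20, 60]})\n",
    "\n",
    "# As integers, pandas will happily average the codes, which is meaningless\n",
    "print(demo['MSSubClass'].mean())\n",
    "\n",
    "# After the cast the column holds strings, so it is counted, not averaged\n",
    "demo['MSSubClass'] = demo['MSSubClass'].astype(str)\n",
    "print(demo['MSSubClass'].value_counts())\n",
    "```"
   ]
  },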
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('int64')"
      ]
     },
     "execution_count": 79,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_df['MSSubClass'].dtypes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "all_df['MSSubClass'] = all_df['MSSubClass'].astype(str)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once the column is *str*, a quick count makes things clear."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "20     1079\n",
       "60      575\n",
       "50      287\n",
       "120     182\n",
       "30      139\n",
       "160     128\n",
       "70      128\n",
       "80      118\n",
       "90      109\n",
       "190      61\n",
       "85       48\n",
       "75       23\n",
       "45       18\n",
       "180      17\n",
       "40        6\n",
       "150       1\n",
       "Name: MSSubClass, dtype: int64"
      ]
     },
     "execution_count": 81,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_df['MSSubClass'].value_counts()"
   ]
  },
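  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Convert categorical variables to a numerical representation\n",
    "\n",
    "When we represent a categorical variable with numbers, bear in mind that numbers carry an ordering and a magnitude, so assigning arbitrary integers can mislead the model later on. We can use one-hot encoding to represent the categories instead.\n",
    "\n",
    "pandas' built-in get_dummies function does the one-hot encoding in a single call."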
   ]
  },
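  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of what one-hot encoding produces, here is `pd.get_dummies` applied to a hypothetical three-row column (`demo` and the `Zone` prefix are made up for illustration):\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical categorical column\n",
    "demo = pd.Series(['RL', 'RM', 'RL'])\n",
    "\n",
    "# One indicator column per distinct category, named <prefix>_<category>\n",
    "one_hot = pd.get_dummies(demo, prefix='Zone')\n",
    "print(one_hot)\n",
    "\n",
    "# Each row has exactly one active indicator\n",
    "assert (one_hot.sum(axis=1) == 1).all()\n",
    "```\n",
    "\n",
    "Because each category gets its own column, no artificial ordering between categories is introduced."
   ]
  },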
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>MSSubClass_120</th>\n",
       "      <th>MSSubClass_150</th>\n",
       "      <th>MSSubClass_160</th>\n",
       "      <th>MSSubClass_180</th>\n",
       "      <th>MSSubClass_190</th>\n",
       "      <th>MSSubClass_20</th>\n",
       "      <th>MSSubClass_30</th>\n",
       "      <th>MSSubClass_40</th>\n",
       "      <th>MSSubClass_45</th>\n",
       "      <th>MSSubClass_50</th>\n",
       "      <th>MSSubClass_60</th>\n",
       "      <th>MSSubClass_70</th>\n",
       "      <th>MSSubClass_75</th>\n",
       "      <th>MSSubClass_80</th>\n",
       "      <th>MSSubClass_85</th>\n",
       "      <th>MSSubClass_90</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Id</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    MSSubClass_120  MSSubClass_150  MSSubClass_160  MSSubClass_180  \\\n",
       "Id                                                                   \n",
       "1              0.0             0.0             0.0             0.0   \n",
       "2              0.0             0.0             0.0             0.0   \n",
       "3              0.0             0.0             0.0             0.0   \n",
       "4              0.0             0.0             0.0             0.0   \n",
       "5              0.0             0.0             0.0             0.0   \n",
       "\n",
       "    MSSubClass_190  MSSubClass_20  MSSubClass_30  MSSubClass_40  \\\n",
       "Id                                                                \n",
       "1              0.0            0.0            0.0            0.0   \n",
       "2              0.0            1.0            0.0            0.0   \n",
       "3              0.0            0.0            0.0            0.0   \n",
       "4              0.0            0.0            0.0            0.0   \n",
       "5              0.0            0.0            0.0            0.0   \n",
       "\n",
       "    MSSubClass_45  MSSubClass_50  MSSubClass_60  MSSubClass_70  MSSubClass_75  \\\n",
       "Id                                                                              \n",
       "1             0.0            0.0            1.0            0.0            0.0   \n",
       "2             0.0            0.0            0.0            0.0            0.0   \n",
       "3             0.0            0.0            1.0            0.0            0.0   \n",
       "4             0.0            0.0            0.0            1.0            0.0   \n",
       "5             0.0            0.0            1.0            0.0            0.0   \n",
       "\n",
       "    MSSubClass_80  MSSubClass_85  MSSubClass_90  \n",
       "Id                                               \n",
       "1             0.0            0.0            0.0  \n",
       "2             0.0            0.0            0.0  \n",
       "3             0.0            0.0            0.0  \n",
       "4             0.0            0.0            0.0  \n",
       "5             0.0            0.0            0.0  "
      ]
     },
     "execution_count": 82,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.get_dummies(all_df['MSSubClass'], prefix='MSSubClass').head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*MSSubClass* has now been split into 16 columns, one per category. A column holds 1 when the row belongs to that category and 0 otherwise."
   ]
  },
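  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the 0/1 layout concrete, here is a minimal sketch on a made-up four-row column. The toy frame and its values are assumptions for illustration only; they are not taken from the housing data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Made-up toy column (hypothetical values, not the real dataset)\n",
    "toy = pd.DataFrame({'MSSubClass': ['60', '20', '60', '70']})\n",
    "toy_dummies = pd.get_dummies(toy['MSSubClass'], prefix='MSSubClass')\n",
    "\n",
    "# One indicator column per distinct category\n",
    "print(toy_dummies.columns.tolist())\n",
    "# Each row belongs to exactly one category, so every row sums to 1\n",
    "print(toy_dummies.astype(int).sum(axis=1).tolist())"
   ]
  },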
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the same way, we one-hot encode all the categorical columns at once."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>LotFrontage</th>\n",
       "      <th>LotArea</th>\n",
       "      <th>OverallQual</th>\n",
       "      <th>OverallCond</th>\n",
       "      <th>YearBuilt</th>\n",
       "      <th>YearRemodAdd</th>\n",
       "      <th>MasVnrArea</th>\n",
       "      <th>BsmtFinSF1</th>\n",
       "      <th>BsmtFinSF2</th>\n",
       "      <th>BsmtUnfSF</th>\n",
       "      <th>...</th>\n",
       "      <th>SaleType_ConLw</th>\n",
       "      <th>SaleType_New</th>\n",
       "      <th>SaleType_Oth</th>\n",
       "      <th>SaleType_WD</th>\n",
       "      <th>SaleCondition_Abnorml</th>\n",
       "      <th>SaleCondition_AdjLand</th>\n",
       "      <th>SaleCondition_Alloca</th>\n",
       "      <th>SaleCondition_Family</th>\n",
       "      <th>SaleCondition_Normal</th>\n",
       "      <th>SaleCondition_Partial</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Id</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>65.0</td>\n",
       "      <td>8450</td>\n",
       "      <td>7</td>\n",
       "      <td>5</td>\n",
       "      <td>2003</td>\n",
       "      <td>2003</td>\n",
       "      <td>196.0</td>\n",
       "      <td>706.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>150.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>80.0</td>\n",
       "      <td>9600</td>\n",
       "      <td>6</td>\n",
       "      <td>8</td>\n",
       "      <td>1976</td>\n",
       "      <td>1976</td>\n",
       "      <td>0.0</td>\n",
       "      <td>978.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>284.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>68.0</td>\n",
       "      <td>11250</td>\n",
       "      <td>7</td>\n",
       "      <td>5</td>\n",
       "      <td>2001</td>\n",
       "      <td>2002</td>\n",
       "      <td>162.0</td>\n",
       "      <td>486.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>434.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>60.0</td>\n",
       "      <td>9550</td>\n",
       "      <td>7</td>\n",
       "      <td>5</td>\n",
       "      <td>1915</td>\n",
       "      <td>1970</td>\n",
       "      <td>0.0</td>\n",
       "      <td>216.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>540.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>84.0</td>\n",
       "      <td>14260</td>\n",
       "      <td>8</td>\n",
       "      <td>5</td>\n",
       "      <td>2000</td>\n",
       "      <td>2000</td>\n",
       "      <td>350.0</td>\n",
       "      <td>655.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>490.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 303 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    LotFrontage  LotArea  OverallQual  OverallCond  YearBuilt  YearRemodAdd  \\\n",
       "Id                                                                            \n",
       "1          65.0     8450            7            5       2003          2003   \n",
       "2          80.0     9600            6            8       1976          1976   \n",
       "3          68.0    11250            7            5       2001          2002   \n",
       "4          60.0     9550            7            5       1915          1970   \n",
       "5          84.0    14260            8            5       2000          2000   \n",
       "\n",
       "    MasVnrArea  BsmtFinSF1  BsmtFinSF2  BsmtUnfSF          ...            \\\n",
       "Id                                                         ...             \n",
       "1        196.0       706.0         0.0      150.0          ...             \n",
       "2          0.0       978.0         0.0      284.0          ...             \n",
       "3        162.0       486.0         0.0      434.0          ...             \n",
       "4          0.0       216.0         0.0      540.0          ...             \n",
       "5        350.0       655.0         0.0      490.0          ...             \n",
       "\n",
       "    SaleType_ConLw  SaleType_New  SaleType_Oth  SaleType_WD  \\\n",
       "Id                                                            \n",
       "1              0.0           0.0           0.0          1.0   \n",
       "2              0.0           0.0           0.0          1.0   \n",
       "3              0.0           0.0           0.0          1.0   \n",
       "4              0.0           0.0           0.0          1.0   \n",
       "5              0.0           0.0           0.0          1.0   \n",
       "\n",
       "    SaleCondition_Abnorml  SaleCondition_AdjLand  SaleCondition_Alloca  \\\n",
       "Id                                                                       \n",
       "1                     0.0                    0.0                   0.0   \n",
       "2                     0.0                    0.0                   0.0   \n",
       "3                     0.0                    0.0                   0.0   \n",
       "4                     1.0                    0.0                   0.0   \n",
       "5                     0.0                    0.0                   0.0   \n",
       "\n",
       "    SaleCondition_Family  SaleCondition_Normal  SaleCondition_Partial  \n",
       "Id                                                                     \n",
       "1                    0.0                   1.0                    0.0  \n",
       "2                    0.0                   1.0                    0.0  \n",
       "3                    0.0                   1.0                    0.0  \n",
       "4                    0.0                   0.0                    0.0  \n",
       "5                    0.0                   1.0                    0.0  \n",
       "\n",
       "[5 rows x 303 columns]"
      ]
     },
     "execution_count": 83,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_dummy_df = pd.get_dummies(all_df)\n",
    "all_dummy_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Handling the numerical variables\n",
    "\n",
    "Even the numerical variables have a few small problems.\n",
    "\n",
    "For example, some values are missing:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "LotFrontage     486\n",
       "GarageYrBlt     159\n",
       "MasVnrArea       23\n",
       "BsmtHalfBath      2\n",
       "BsmtFullBath      2\n",
       "BsmtFinSF2        1\n",
       "GarageCars        1\n",
       "TotalBsmtSF       1\n",
       "BsmtUnfSF         1\n",
       "GarageArea        1\n",
       "dtype: int64"
      ]
     },
     "execution_count": 84,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_dummy_df.isnull().sum().sort_values(ascending=False).head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As the output shows, the column with the most missing values is LotFrontage."
   ]
  },
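  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "How to handle these missing values depends on reading the problem statement carefully. The dataset description usually states clearly what a missing value means. If it does not, we have to fall back on our own assumptions.\n",
    "\n",
    "Here, we fill the gaps with each column's mean."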
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "LotFrontage        69.305795\n",
       "LotArea         10168.114080\n",
       "OverallQual         6.089072\n",
       "OverallCond         5.564577\n",
       "YearBuilt        1971.312778\n",
       "YearRemodAdd     1984.264474\n",
       "MasVnrArea        102.201312\n",
       "BsmtFinSF1        441.423235\n",
       "BsmtFinSF2         49.582248\n",
       "BsmtUnfSF         560.772104\n",
       "dtype: float64"
      ]
     },
     "execution_count": 85,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mean_cols = all_dummy_df.mean()\n",
    "mean_cols.head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "all_dummy_df = all_dummy_df.fillna(mean_cols)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check that no missing values remain:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0"
      ]
     },
     "execution_count": 87,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_dummy_df.isnull().sum().sum()"
   ]
  },
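  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The mean fill can be sketched on a tiny made-up frame. The column values below are assumptions for illustration only. The point to notice is that fillna, given a Series of means, matches each mean to its column by name."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Made-up toy frame with one gap per column (hypothetical values)\n",
    "toy = pd.DataFrame({'LotFrontage': [60.0, np.nan, 80.0],\n",
    "                    'LotArea': [8000.0, 9000.0, np.nan]})\n",
    "\n",
    "# mean() skips NaN by default, so each mean uses only the observed values\n",
    "toy_means = toy.mean()\n",
    "# fillna with a Series fills every column with the mean that shares its name\n",
    "toy_filled = toy.fillna(toy_means)\n",
    "\n",
    "print(toy_filled)\n",
    "print(toy_filled.isnull().sum().sum())"
   ]
  },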
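  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Standardizing the numerical data\n",
    "\n",
    "This step is not always necessary; it depends on which model you plan to use. Regression models, regularized linear ones such as Ridge in particular, tend to be sensitive to feature scale, so it is best to put the source data on a common scale (zero mean, unit variance) and keep the columns from differing too much in magnitude.\n",
    "\n",
    "We do not need to standardize the 0/1 One-Hot columns, of course. Our target is the columns that were numerical to begin with.\n",
    "\n",
    "First, let's see which columns are numerical:"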
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 88,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['LotFrontage', 'LotArea', 'OverallQual', 'OverallCond', 'YearBuilt',\n",
       "       'YearRemodAdd', 'MasVnrArea', 'BsmtFinSF1', 'BsmtFinSF2', 'BsmtUnfSF',\n",
       "       'TotalBsmtSF', '1stFlrSF', '2ndFlrSF', 'LowQualFinSF', 'GrLivArea',\n",
       "       'BsmtFullBath', 'BsmtHalfBath', 'FullBath', 'HalfBath', 'BedroomAbvGr',\n",
       "       'KitchenAbvGr', 'TotRmsAbvGrd', 'Fireplaces', 'GarageYrBlt',\n",
       "       'GarageCars', 'GarageArea', 'WoodDeckSF', 'OpenPorchSF',\n",
       "       'EnclosedPorch', '3SsnPorch', 'ScreenPorch', 'PoolArea', 'MiscVal',\n",
       "       'MoSold', 'YrSold'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 88,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "numeric_cols = all_df.columns[all_df.dtypes != 'object']\n",
    "numeric_cols"
   ]
  },
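  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Compute the standard score: (X - X') / s, where X' is the column mean and s is the column standard deviation.\n",
    "\n",
    "This puts every column on a comparable scale, which makes the data easier to work with.\n",
    "\n",
    "Note: we could also keep using a log transform here; I just want to show that there is more than one way to 'smooth' the data."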
   ]
  },
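  {
   "cell_type": "code",
   "execution_count": 89,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Per-column mean and standard deviation of the originally numerical columns only\n",
    "numeric_col_means = all_dummy_df.loc[:, numeric_cols].mean()\n",
    "numeric_col_std = all_dummy_df.loc[:, numeric_cols].std()\n",
    "# Standard score: subtract the mean, divide by the standard deviation; the One-Hot columns stay untouched\n",
    "all_dummy_df.loc[:, numeric_cols] = (all_dummy_df.loc[:, numeric_cols] - numeric_col_means) / numeric_col_std"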
   ]
  },
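  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check of the formula, here is a minimal sketch on a made-up frame with one numerical column and one 0/1 column. The names toy and is_new are hypothetical. After standardizing, the numerical column should have mean 0 and standard deviation 1, and the 0/1 column should be unchanged."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Made-up toy frame: one numerical column, one indicator column (hypothetical)\n",
    "toy = pd.DataFrame({'LotArea': [8450.0, 9600.0, 11250.0, 9550.0, 14260.0],\n",
    "                    'is_new': [0.0, 1.0, 0.0, 0.0, 1.0]})\n",
    "toy_numeric_cols = ['LotArea']\n",
    "\n",
    "toy_means = toy.loc[:, toy_numeric_cols].mean()\n",
    "toy_std = toy.loc[:, toy_numeric_cols].std()  # pandas uses the sample std (ddof=1)\n",
    "toy.loc[:, toy_numeric_cols] = (toy.loc[:, toy_numeric_cols] - toy_means) / toy_std\n",
    "\n",
    "# Mean is ~0 and std is ~1 for the standardized column; is_new is left alone\n",
    "print(toy['LotArea'].mean(), toy['LotArea'].std())\n",
    "print(toy['is_new'].tolist())"
   ]
  },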
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 4: Building the models\n",
    "\n",
    "#### Split the data back into training and test sets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 90,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "dummy_train_df = all_dummy_df.loc[train_df.index]\n",
    "dummy_test_df = all_dummy_df.loc[test_df.index]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 91,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "((1460, 303), (1459, 303))"
      ]
     },
     "execution_count": 91,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dummy_train_df.shape, dummy_test_df.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 92,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "X_train = dummy_train_df.values\n",
    "X_test = dummy_test_df.values"
   ]
  },
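  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### A more advanced Ensemble\n",
    "\n",
    "In general, a single model can only do so much. We tend to combine many models into one ensemble model to get the best result.\n",
    "\n",
    "From the earlier experiments we know that Ridge(alpha=15) gave us the best result."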
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 93,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from sklearn.linear_model import Ridge\n",
    "ridge = Ridge(15)"
   ]
  },
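  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Bagging\n",
    "\n",
    "Bagging puts many small models together, trains each one on a random subset of the data, and then combines their outputs (majority vote for classification; for regression, as here, the predictions are averaged).\n",
    "\n",
    "Sklearn already provides this framework, so we can call it directly:"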
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 94,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from sklearn.ensemble import BaggingRegressor\n",
    "from sklearn.model_selection import cross_val_score"
   ]
  },
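  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The averaging step can be checked on synthetic data. This is only a sketch under stated assumptions: X_toy, y_toy and the coefficients are made up, and the settings are the BaggingRegressor defaults apart from n_estimators and random_state. With default settings every fitted sub-model sees all features, so averaging their predictions by hand should reproduce the ensemble's own prediction."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.ensemble import BaggingRegressor\n",
    "from sklearn.linear_model import Ridge\n",
    "\n",
    "# Synthetic regression data (hypothetical, not the housing set)\n",
    "rng = np.random.RandomState(0)\n",
    "X_toy = rng.normal(size=(200, 5))\n",
    "y_toy = np.dot(X_toy, np.array([1.0, -2.0, 0.5, 0.0, 3.0])) + rng.normal(scale=0.1, size=200)\n",
    "\n",
    "# The base model is passed positionally, which works across scikit-learn versions\n",
    "bag = BaggingRegressor(Ridge(alpha=15), n_estimators=10, random_state=0)\n",
    "bag.fit(X_toy, y_toy)\n",
    "\n",
    "# Average the predictions of the fitted sub-models by hand\n",
    "manual_avg = np.mean([est.predict(X_toy) for est in bag.estimators_], axis=0)\n",
    "print(np.allclose(manual_avg, bag.predict(X_toy)))"
   ]
  },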
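  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we use the CV score to test how the number of estimators affects the final result.\n",
    "\n",
    "Note that when setting up Bagging, you need to pass your small model (ridge) as its base estimator. The parameter is named base_estimator in older scikit-learn releases and was renamed estimator in version 1.2."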
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "params = [1, 10, 15, 20, 25, 30, 40]\n",
    "test_scores = []\n",
    "for param in params:\n",
    "    # Pass ridge positionally: the keyword is base_estimator in older scikit-learn, estimator from 1.2 on\n",
    "    clf = BaggingRegressor(ridge, n_estimators=param)\n",
    "    # 10-fold CV; the scorer returns negative MSE, so negate it before taking the root\n",
    "    test_score = np.sqrt(-cross_val_score(clf, X_train, y_train, cv=10, scoring='neg_mean_squared_error'))\n",
    "    test_scores.append(np.mean(test_score))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYMAAAEKCAYAAADw2zkCAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XmclWX9//HXG1mUAHfzK4hLhIq78FNM+TamJpq5m1gWfm0xSS1X0CyHrxvkktvXtEJcUrFcErFMU6ayUhFR2TEXFhE0U3FBHeHz++O6Bw/DmZkz48zcZ2bez8djHnPf1719zg1zPue6rvu6jiICMzPr2DrlHYCZmeXPycDMzJwMzMzMycDMzHAyMDMznAzMzAwnAzMzw8nAypykP0j6Zt5xmLV3TgZWNiSdL+mWwrKIOCgibm2Ba20haaWksvwbkPR1SVMkvSPpFUkPSNpL0jBJLxXZfy1JSyUdVGTbcEkfS1qW/byT/d60dV6NtQVl+Ydg1goERPa78QdLazVvOKud+3TgCuBCYBOgL3AdcAhwL7CupP+uddiBwErgwTpO+4+I6JX99Mx+Lyly7TVeV1Nea7kmWaub/8GsXpJeknSGpGclvSnpDkldSzjuYEnTsmMek7RjwbaRkhZln05nS9pH0gHAucAx2SfXadm+kyWdkC0Pz851RXbef0naMytfIGmJpG8VXOcgSU9LelvSfEnnF4T4l+z3W1kceyg5T9LL2bluktQrO1dNTeIESfOBR4q85lmFn8yzT+uvS9pFUjdJt0r6dxb7E5I2LnKOXsBoYERE3BcRyyNiRUQ8EBEjI+JD4HfAt2od+k3gtohY2dC/TZFrviTpbEnPAu9mcdcu6yRpu+zf401J0yV9teAc4yVdl9Vg3gEqGhuH5Swi/OOfOn+Al4DHgc8C6wGzgO81cMxuwFJgEOmT9zez83QB+gMLgM9m+/YFtsqWzwduqXWuycAJ2fJw4CPSG6GAC4D5wDXZufcHlgHds/3/G9g+W94BeBU4JFvfAlgBqOBaJwDzsm3dgbtr4snKVgI3AesA3Yq87vOA3xSsfwWYlS1/D7gP6JbFvivQo8g5DsheY6d67u8XgLdqYgB6Ae8DO9ax/3Dgrw38Gz8NbFZwztXKgM7A88DIbHmf7F5/Ptt/PPAmMDhb75r3/13/NO7HNQMrxVURsTQi3gLuB3ZpYP/vANdHxFOR3Ap8CAwmvQF3BXaQ1DkiFkTEGm3g9XgpIm6J9I5zJ9AHGB0R1RHxMOmNtB9ARPw1ImZmyzOACcAXa52vsJno68AVETE/It4HzgGGFTR5BHB+pE/rHxaJ7Q7gEElrZ+vHArdny9XAhkD/7J5Mi4h3i5xjQ+DfUc8n/Ij4BynZHp4VHQPMjYjpdR0D7CnpP9nPm5Ker7X9qohYXOt1FZYNBj4TEWMj4uOImAxMyl5jjfsi4vEsxo/qicXKkJOBlWJpwfL7QI8G9t8COKPwzYf0pr1ZRLwA/AioBJZKur2RHZmFsSwHiIh/1yrrAZA1/Twq6TVJbwEnAhvVc+7NSDWNGvNJn4I/W1C2qK6Ds9c2C/iqpHVIbfw1yeBW4E/AhKyJbEwdbfFvABuV0OZ+K580FR0H3NzA/v+MiA2yn/Uj4vO1thd7XYVlmwELa22fD/QuWK+93doQJwNrCQuBi2q9+fSIiDsBImJCRAwhJQ2Asdnv5p5P/Tbg90DviFgPuIFPagLFrrW4ICay5WpWT0ANxTiBVMM4FJgZES8CZJ+mL4iI7UnNPF9lzXZ/gH8CHwCHNXCdW4B9JQ0G9uCTpNNUxV5XYdliYPNa2/sCrzRwDmsjnAysJfwK+L6k3QEkfSbrzP2MpP5Zh3FXUpPOclLTEaQ33S0lNeYJn/r27QG8GRHVWSxfL9j2OqkP4HMFZXcAp0naUlIP4CJgQkGTTSlxTQC+DJxEwRu0pApJO2Sf+N8lJZkVtQ+OiGWkvpP/k3SopHUkdZY0VNKYgv0WAH/PYn44Il5rIK4mPTVV4AngvaxTubOkCuDg7PrWDpSUDLL/iHMkzZM0
ssj2IZKmSqqWdESR7T2zqvHVBWVdJN0gaW72FMbhtY+zstDoT3sRMRX4LnCtpP+QOmWHZ5u7AWNIb8aLgY1JTxFBekpGwBuSnirx+rW3F66PAC6Q9Dapc/fOghiXk97s/541Ze0O3Ehqfvkr8AKpSezUeq61ZjDpcc1/ktrY7yzYtClwF/A2MJPUMf6bOs7xc+D0LObXSB3uPyDVcgrdTPp03lATEcBgrTnOYGA9r2u1soioJjV7HQT8G7gW+GZEPF9sf2t7lPrh6tkhfZKZB+xL+uOdAgyLiDkF+/QlPdFwJjAxIu6pdY4rSW21/4mIU7OyStITEz/N1jeIiP800+syM7NG6FzCPrsDz0fEfABJE0jtoauSQVZlRdIamSX79LEJaTDMoIJNJwDbFJzDicDMLCelNBP1ZvWnBBax+hMEdcrafi8DzqKgzVLSutnihVnz0p3FBuBY+ZJ0TkFzQ+HPA3nHZmaNV0oyKNbxVGr74AjggYioeeKg5lydSY8a/i0iBpIGNV1e4jmtDETEJfHJtAaFP1/JOzYza7xSmokWkTqpavQh9R2UYk9gb0kjgJ5AF0nvRMS5kt6LiJoOsd+Rmo3WUKzpyczMGhYRJT9FVkrNYArQT2lulq7AMGBiPfuvunhEHBcRW0bE1qTO5VsioubJkfsl7ZMt70carFNUnkO0S/05//zzc4+hPcToOB1nuf+0lTgbq8FkEBErgJOBh0iPxE2IiNmSRks6GEDSIEkLgaOA6yXVNyy+xiigUtIzwDeAMxodvZmZNYtSmomIiAcpePInKzu/YPkp1hydWPscN1PwPHSkJ5BqzxNjZmY58AjkZlJRUZF3CA1qCzGC42xujrN5tZU4G6vBQWd5kxTlHqOZWbmRRDRzB7KZmbVzTgZmZuZkYGZmTgZmZoaTgZmZ4WRgZmY4GZiZGU4GZmaGk4GZmeFkYGZmOBmYmRlOBmZmRhtPBm++Cf/6V95RmJm1fW06GUycCMcfD57U1Mzs02nTyeC442DZMrjvvrwjMTNr29r89xk8+CD88IcwYwZ06dKKgZmZlbEO930GBxwAm28O48blHYmZWdtVUjKQNFTSHEnzJI0ssn2IpKmSqiUdUWR7T0mLJF1dZNtESc81LXyQ4Gc/g9Gj4Z13mnoWM7OOrcFkIKkTcC1wALA9cKykbWvtNh8YDtxWx2kuAKqKnPtwYFkj4i1qt91g333h8ss/7ZnMzDqmUmoGuwPPR8T8iKgGJgCHFu4QEQsiYgawRuO+pIHAJsBDtco/A5wGXNjE2Fdz4YVwzTXw6qvNcTYzs46llGTQG1hYsL4oK2uQJAGXAWcBtTsyLsi2LS/lXA3Zckv4n/9JzUVmZtY4pSSDYr3RpT6CNAJ4ICJeWe2E0s5Av4iYmJ2/5B7v+px7Ltx9N8ye3RxnMzPrODqXsM8ioG/Beh9gcYnn3xPYW9IIoCfQRdK7wAJgN0kvAl2ATSQ9GhFfKnaSysrKVcsVFRVUVFQUvdgGG8DIkXDOOfD735cYoZlZO1BVVUVVVVWTj29wnIGktYC5wL7Aq8CTwLERscbnb0njgUkRcXeRbcOBgRFxaq3yLYD7I2KnOq5f7ziD2j74ALbZBn7zGxgypOTDzMzalWYfZxARK4CTSR3AM4EJETFb0mhJB2cXHSRpIXAUcL2k6U0L/9Nbe2246CI46yxPU2FmVqo2PwK5mJUrYeDA1Idw9NEtFJiZWRlrbM2gXSYDgD//Gb7/fZg1C7p2bYHAzMzKWIebjqIu++0H/frBDTfkHYmZWflrtzUDgOeeg/33h3nzYN11mzkwM7My5ppBgZ12goMOSnMXmZlZ3dp1zQBg4ULYZRd49lno06cZAzMzK2PuQC7inHPgtdc8zbWZdRxOBkW8/Tb075+eMNpxx2YKzMysjLnPoIh1101jDkaNyjsSM7Py1CGSAcBJJ8GcOfDoo3lHYmZWfjpMMujaFS6+GM4+O41QNjOzT3SYZABpaopO
neDOO/OOxMysvHSIDuRCVVXpS3DmzIFu3ZrttGZmZcUdyA2oqIAddoDrrss7EjOz8tHhagYAM2fCPvvA3Lmw/vrNemozs7LgmkEJtt8eDj0UxozJOxIzs/LQIWsGAIsXpwFo06ZB374N729m1pa4ZlCizTaDESPgJz/JOxIzs/x12JoBwLJlaZqKBx9Mk9mZmbUXrhk0Qq9eqWZw9tl5R2Jmlq+SkoGkoZLmSJonaWSR7UMkTZVULemIItt7Slok6epsfR1JkyTNljRd0sWf/qU0zfe+By+/DA89lFcEZmb5azAZSOoEXAscAGwPHCtp21q7zQeGA7fVcZoLgKpaZZdGxHbArsDekg5oRNzNpksXuOSSVDtYsSKPCMzM8ldKzWB34PmImB8R1cAE4NDCHSJiQUTMANZo3Jc0ENgEeKhg/+UR8Zds+WPgaSC3r5454gjo3h1uqyuVmZm1c6Ukg97AwoL1RVlZgyQJuAw4CyjakSFpPeCrwCOlnLMlSHDppXDeebB8eV5RmJnlp3MJ+xR7Ey/18Z4RwAMR8UrKC6ufS9JawO3AlRHxcl0nqaysXLVcUVFBRUVFiZcv3V57waBBcM017lA2s7anqqqKqqqqJh/f4KOlkgYDlRExNFsfBUREjC2y73jg/oi4J1v/DbA3sBLoCXQBrouIc7Pt44BlEXFaPddvsUdLa5s7F/beO01it+GGrXJJM7MW0RKPlk4B+knaQlJXYBgwsb4YahYi4riI2DIitgbOBG4pSAQXAr3qSwStbZtt0jTXF12UdyRmZq2rpEFnkoYCV5GSx7iIGCNpNDAlIiZJGgTcC6wHfAAsiYgda51jODAwIk6VVNMPMRv4iNTsdG1E3Fjk2q1WMwBYuhQGDICnnoKttmq1y5qZNavG1gw69Ajkuvzv/6amottvb9XLmpk1GyeDZvDuu2maiokTU6eymVlb4+komkGPHlBZCWedBWWeK83MmoWTQR1OOAGWLIE//jHvSMzMWp6TQR06d4axYz1NhZl1DE4G9fjqV2GDDeDmm/OOxMysZbkDuQFPPAFHHgnz5qX5i8zM2gJ3IDezPfaAL3wBrrwy70jMzFqOawYl+Ne/YPBgmD0bNt4411DMzEricQYt5NRT0++rr843DjOzUjgZtJDXX4fttoN//hM+//m8ozEzq5/7DFrIxhvD6afDuefmHYmZWfNzzaAR3n8/TVNx112pD8HMrFy5ZtCCundPk9h5mgoza2+cDBpp+HB46600iZ2ZWXvhZqIm+OMf4bTTYMaMNG2FmVm5cTNRKxg6FHr3hnHj8o7EzKx5uGbQRFOnprmL5s1LU16bmZUT1wxaycCBsM8+cPnleUdiZvbpuWbwKbz0UvomtJkzYdNN847GzOwTLVIzkDRU0hxJ8ySNLLJ9iKSpkqolHVFke09JiyRdXVC2m6TnsnO2yWngttoKjj8eRo/OOxIzs0+nwWQgqRNwLXAAsD1wrKRta+02HxgO3FbHaS4AqmqV/QL4TkT0B/pLOqARcZeNH/84DUKbMyfvSMzMmq6UmsHuwPMRMT8iqoEJwKGFO0TEgoiYAazRniNpILAJ8FBB2aZAz4h4Miu6BTisaS8hXxtskL4N7Zxz8o7EzKzpSkkGvYGFBeuLsrIGSRJwGXAWUNh21Ts7T6PPWY5OOQWefhoeeyzvSMzMmqaUIVPFOiBK7dEdATwQEa+kvNC0c1ZWVq5arqiooKKiosTLt46114YLL0zTVPzjH6CSu2zMzJpHVVUVVVVVTT6+waeJJA0GKiNiaLY+CoiIGFtk3/HA/RFxT7b+G2BvYCXQE+gCXAdcDUyOiO2y/YYBX4yIk4qcs2yfJiq0ciXsthv85CfpazLNzPLUEk8TTQH6SdpCUldgGFDfzDyrLh4Rx0XElhGxNXAmcEtEnBsRS4BlknbPmpK+BdxXatDlqFMnuPRSGDUKqqvzjsbMrHEaTAYRsQI4mdQBPBOYEBGzJY2WdDCApEGS
FgJHAddLml7CtUcA44B5pA7qB5v6IsrF/vvD1lvDL3+ZdyRmZo3jQWfN7Jln0txF8+ZBr155R2NmHZWno8jZLrvAAQekJiMzs7bCNYMWsGAB7LorPPdcmt3UzKy1NbZm4GTQQkaOhDfegF//Ou9IzKwjcjIoE2+9lb4v+dFHYYcd8o7GzDoa9xmUifXWg3PPTY+ampmVOyeDFnTSSTBrFkyenHckZmb1czJoQd26wcUXp4nsVq7MOxozs7o5GbSwr30t/f7tb/ONw8ysPu5AbgWTJ8O3vw2zZ6fagplZS3MHchnaZx8YMAB+8Yu8IzEzK841g1YyYwZ86Utpmor11ss7GjNr71wzKFM77ACHHAJjxuQdiZnZmlwzaEWvvAI77QTTpkHfvnlHY2btmUcgl7nzzoNFi+Cmm/KOxMzaMyeDMrdsWZqm4k9/gp13zjsaM2uv3GdQ5nr1SrWDkSPzjsTM7BNOBjn43vfghRfg4YfzjsTMLHEyyEHXrnDJJZ6mwszKh5NBTo48Mo1Gvv32vCMxMysxGUgaKmmOpHmS1mjtljRE0lRJ1ZKOKCjvK+kpSU9Lmi7pxIJtx0p6TtIzkv4gaYPmeUltg5S+GvPHP4YPPsg7GjPr6Bp8mkhSJ2AesC+wGJgCDIuIOQX79AV6AWcCEyPinqy8c3aNakndgZnAnsDr2bm2jYg3JY0F3ouI/y1y/Xb1NFFthx0Ge+8NZ56ZdyRm1p60xNNEuwPPR8T8iKgGJgCHFu4QEQsiYgYQtco/zo4BWAeoCazmd09JIiWSxaUG3Z6MGQNjx8J//pN3JGbWkZWSDHoDCwvWF2VlJZHUR9KzwHxgbEQsiYiPgRHA9Ox82wHjSo66Hdl2WzjqKLjoorwjMbOOrHMJ+xSrZpTcbhMRi4CdJW0K3CfpLuBN4CRg54h4WdI1wLlA0bfEysrKVcsVFRVUVFSUevk24fzzYfvt4eSTYaut8o7GzNqiqqoqqqqqmnx8KX0Gg4HKiBiarY8CIiLGFtl3PHB/TZ9Bke03ApOABcAlEbF/Vj4EGBkRBxc5pl33GdQYPTrNaHrbbXlHYmbtQUv0GUwB+knaQlJXYBgwsb4YCoLpLWntbHl9YC9gLvAKMEDShtmu+wOzSw26PTrjjPQlOFOn5h2JmXVEJc1NJGkocBUpeYyLiDGSRgNTImKSpEHAvcB6wAfAkojYUdJ+wOXASlKSuCYixmXn/B7wI+AjUn/C8RHxZpFrd4iaAcANN8Cdd8Ijj6RHT83MmsoT1bVhH3+cvvfg5z+HAw/MOxoza8s8UV0b1rlzesz07LNhxYq8ozGzjsTJoMwcckj6Wsxbbsk7EjPrSNxMVIYefzyNPZg3D7p3zzsaM2uL3EzUDgweDHvuCVddlXckZtZRuGZQpp5/PiWE2bNh443zjsbM2ho/TdSOnHIKdOrkGoKZNZ6TQTvy2mswYEDqQ+jXL+9ozKwtcZ9BO7LJJnDaaek7D8zMWpJrBmXu/fehf3+4+27YY4+8ozGztsI1g3ame/c0id3ZZ0MHzolm1sKcDNqA4cPhjTdg0qS8IzGz9srJoA2omaZi5Mg0f5GZWXNzMmgjDjoIPvtZGD8+70jMrD1yB3Ib8tRTcOihMHcu9OiRdzRmVs7cgdyODRoEX/wiXHFF3pGYWXvjmkEb89JLKSnMmpWajczMivEI5A7g9NPhgw/guuvyjsTMypWTQQfwxhuw7bbw2GOwzTZ5R2Nm5ch9Bh3AhhvCWWfBOefkHYmZtRclJQNJQyXNkTRP0sgi24dImiqpWtIRBeV9JT0l6WlJ0yWdWLCti6QbJM2VNEvS4c3zkjqGU05JTxf9/e95R2Jm7UGDzUSSOgHzgH2BxcAUYFhEzCnYpy/QCzgTmBgR92TlnbNrVEvqDswE9oyIJZIqgU4R8dNs3w0i4j9Fru9mojrccgtcf31KCCq5MmhmHUFLNBPtDjwfEfMjohqYABxauENELIiIGUDUKv84OwZg
HaAwsBOASwr2XSMRWP2+8Y00kd299+YdiZm1daUkg97AwoL1RVlZSST1kfQsMB8Ym9UK1s02X5g1L90pyd/n1UhrrQU/+xmMGgXV1Q3vb2ZWl84l7FOsmlFyu01ELAJ2lrQpcJ+ku4CVQB/gbxFxhqTTgMuBbxU7R2Vl5arliooKKioqSr18u/flL8OWW8KvfgUjRuQdjZnlpaqqiqqqqiYfX0qfwWCgMiKGZuujgIiIsUX2HQ/cX9NnUGT7jcCkiLhH0jsR0TMr7wP8MSJ2LHKM+wwaMG0aHHhg+t7knj3zjsbMykFL9BlMAfpJ2kJSV2AYMLG+GAqC6S1p7Wx5fWAvYG62+X5J+2TL+wGzSg3aVrfrrqmGcOmleUdiZm1VSYPOJA0FriIlj3ERMUbSaGBKREySNAi4F1gP+ABYEhE7StqP1PyzkpQkromIcdk5+wK3AusCrwP/kzUp1b62awYlmD8fdtsNpk+HzTbLOxozy5tHIHdgZ58Nb70Fv/xl3pGYWd6cDDqwN99M01NUVcGAAXlHY2Z58nQUHdj666fHTEeNyjsSM2trXDNoZz78ME1id9NN6bsPzKxjcs2gg+vWDS6+OE1k5xxqZqVyMmiHjjkGVq6E3/0u70jMrK1wM1E7NXkyfOc76RvRunXLOxoza21uJjIA9tkn9R1cf33ekZhZW+CaQTs2Ywbsuy/MnQvrrZd3NGbWmlwzsFV22AEOPhjGrjGLlJnZ6lwzaOcWLYKdd4ZnnoHNN887GjNrLR6BbGv48Y9h8WIYPz7vSMystTgZ2Brefhv694eHH4addso7GjNrDe4zsDWsuy6cd1768ptnnvFgNDNbUynfdGbtwIknwtKlcPjhsPbaMGxY+tlmm7wjM7Ny4GaiDiYCnngCJkyA3/4WNt00JYVjjoEttsg7OjNrLu4zsJKtWAF//WtKDHffnWoJw4bB0UenJGFmbZeTgTXJRx/Bn/+cEsP998PAgSkxHHlkmhrbzNoWJwP71JYvhz/8ISWGhx6C//5vOPZYOOQQ6NEj7+jMrBROBtasli2DiRPhjjvgscfggANSYjjwwNQRbWblqUUeLZU0VNIcSfMkjSyyfYikqZKqJR1RUN5X0lOSnpY0XdKJRY6dKOm5UgO21tWrFxx3HDzwALz4Iuy/P1xzDWy2GRx/PDz4IFRX5x2lmX1aDdYMJHUC5gH7AouBKcCwiJhTsE9foBdwJjAxIu7Jyjtn16iW1B2YCewZEUuy7YcDRwI7RUTR4VCuGZSnV19NTyNNmAAvvJD6FoYNgyFDoJNHr5jlriVqBrsDz0fE/IioBiYAhxbuEBELImIGELXKP86OAVgHWBWYpM8ApwEXlhqslY//+i/44Q/hn/9Mj6pusUVa33xzOP10ePJJD24za0tKSQa9gYUF64uyspJI6iPpWWA+MLamVgBcAFwGLC/1XFaettoKRo1Ko5v//Gfo2TM1LfXrl+ZFmj497wjNrCGljEAuVs0o+TNfRCwCdpa0KXCfpLuAzYB+EXG6pC3ruMYqlZWVq5YrKiqoqKgo9fLWyrbbDkaPhspKmDYtNSN95Sup76Fm1HO/fnlHadb+VFVVUVVV1eTjS+kzGAxURsTQbH0UEBGxxiz5ksYD99f0GRTZfiMwCdgEOA/4COiSrf89Ir5U5Bj3GbRxK1em5qQJE9L3Mm+++Sejnvv0yTs6s/ap2R8tlbQWMJfUgfwq8CRwbETMLrLveGBSRNydrfcG3oiIDyStDzwOHBERMwuO2YKUQNyB3AF8/DFUVaXEcO+9sP326VHVo46CjTfOOzqz9qNFxhlIGgpcRepjGBcRYySNBqZExCRJg4B7gfWAD4AlEbGjpP2Ay4GVpKagayJiXK1zOxl0UB9+mAa13XFHGuS2xx4pMRx+eJpp1cyazoPOrE167700lmHChDS47YIL4Lvf9WOqZk3lZGBt3vTp8P3vp76G669PX9tpZo3jL7exNm/HHeFvf4NvfzuN
eD7zTHj33byjMmvfnAysLHXqBN/5DsycCf/+NwwYAL//vQeymbUUNxNZm1BVBSedBJ//fJobyV/EY1Y/NxNZu1RRAc8+m544GjgQfvYzT5Bn1pxcM7A254UX4Ac/gFdeSR3Me+2Vd0Tl6+OPU0d81655R2KtrbE1g1KmozArK5/7HPzxj2k089e+BgcdBGPGwIYb5h1Z+XjzTbjhhtSk9vrrsNFG0Ldv+tl889V/9+2bBvyp5LcNa00R6Qunli2Dd95JvwuXi5W9917jr+OagbVpb78NP/lJmk577Fj41rc69pvaSy/BlVfCrbfCwQenGWR32CFNOb5gASxcmH4XLi9cmJ7W2nzz1RNE7aThb7krXUQaVNmYN/C6tr3zTqrZ9eyZ5viq/btYWc+ecPTRHmdgHdBTT6WxCT16wC9+kSbM60ieeAIuuwwmT06P5J5ySuPmfXrvPVi0qHiiqClbe+36axebbQZdurTca2wNH33U+DfrusrWWqtxb+B1van37Nm0++pBZ9ZhrViREsHo0XDiiWn67HXWyTuqlrNiRfpK0ssvT/0nP/oRnHBCevNobhHwxhv11y6WLoVNNimeKGqWN9qo+Wtu1dWffIL+tJ/EV65s/Jt1sW09e0K3bs37OhvLycA6vMWL4bTTUm3h//4Phg7NO6Lm9f77cNNN8POfwwYbpEF5hx8OnXPuAayu/qQ5qq7axfLlayaLmuYpqWmfxKurG9+EUte2bt3aTzOjk4FZ5sEH01NHAwemdvTNNss7ok9nyRK49trUMbz33nDGGelJqrb05vXuuyk51E4UCxem19GUN/B11mlb96C1OBmYFVi+HC6+OD2C+tOfwogRqS23LZk5E664Au65J83qetppafCdWX2cDMyKmD07jWB+992UGAYNyjui+kXAI4+k/oBp01IN56STUpu7WSmcDMzqEJEeuTz7bDj6aLjwwvL73oSPPoI770xJoLo6PRr6jW+kJ3nMGsPTUZjVQUrjEGbOTM+ADxiQ3njL4bPGW2+lcRJbb506hy+5JE3l/e1j4AJnAAAJOUlEQVRvOxFY63DNwDqsv/89jU3o3Ts9dfS5z7V+DC+9BFddBbfcAl/5SqoJ7Lpr68dh7Y9rBmYl2msvePpp2HffNAHehRemGkNreOKJNJXGoEFpdOlzz6UmLCcCy0tJyUDSUElzJM2TNLLI9iGSpkqqlnREQXlfSU9JelrSdEknZuXrSJokaXZWfnHzvSSz0nXpAmedBVOnwpNPpm9Vmzy5Za61YkX6ToYhQ+CYY+ALX4CXX04zsDZmtLBZS2iwmUhSJ2AesC+wGJgCDIuIOQX79AV6AWcCEyPinqy8c3aNakndgZnAnsDbwO4R8Zdsn0eBiyLiT0Wu72YiaxURcN99cOqpsM8+cOmlaUTtp1U4SGz99dP4gCOPzH+QmLVvLdFMtDvwfETMj4hqYAJwaOEOEbEgImYAUav84+wYgHUAZeXLI+IvNfsATwP+bGS5kuCww2DWrDSL5w47wK9+laYoaIqlS9MkeltuCQ89BDfemJqHjjnGicDKTynJoDewsGB9UVZWEkl9JD0LzAfGRsSSWtvXA74KPFLqOc1aUo8eadK3hx9Ob+B7753a9Es1c2Z6CmjbbdNXdj722CfNQx4pa+WqlGRQ7L9vye02EbEoInYG+gHHS9p41YmltYDbgSsj4uVSz2nWGnbeOT1xNHx46mQ+66y654mvGSR20EFp3y23hOefTxPn9e/fqmGbNUkpldVFQN+C9T6kvoNGiYglkmYCQ4B7suJfAnMj4pr6jq2srFy1XFFRQUVFRWMvb9YknTqlGVAPOyxNCDdgQPrCmEMOSdurqz8ZJPbhh+nR0Hvu8dgAa31VVVVUVVU1+fhSOpDXAuaSOpBfBZ4Ejo2I2UX2HQ9Mioi7s/XewBsR8YGk9YHHgSMiYqakC4FtIuLoBq7vDmQrG48+mqaF2G47GDw4TRzXv3/qFD7wwJQ8zMpB
i0xHIWkocBWpWWlcRIyRNBqYEhGTJA0C7gXWAz4AlkTEjpL2Ay4HVpKam66JiHFZklgIzAY+IjU7XRsRNxa5tpOBlZUPP0xPGr34Ipx8Muy2W94Rma3JcxOZmZlHIJuZWeM5GZiZmZOBmZk5GZiZGU4GZmaGk4GZmeFkYGZmOBmYmRlOBmZmhpOBmZnhZGBmZjgZmJkZTgZmZoaTgZmZ4WRgZmY4GZiZGU4GZmaGk4GZmeFkYGZmlJgMJA2VNEfSPEkji2wfImmqpGpJRxSU95X0lKSnJU2XdGLBtt0kPZed88rmeTlmZtYUDSYDSZ2Aa4EDgO2BYyVtW2u3+cBw4LZa5YuBPSNiN2APYJSkTbNtvwC+ExH9gf6SDmj6y8hfVVVV3iE0qC3ECI6zuTnO5tVW4mysUmoGuwPPR8T8iKgGJgCHFu4QEQsiYgYQtco/zo4BWAcQQJYQekbEk9m2W4DDmv4y8tcW/oO0hRjBcTY3x9m82kqcjVVKMugNLCxYX5SVlURSH0nPkmoPYyNiSXb8oqae08zMmlcpyUBFyqJIWVERsSgidgb6AcdL2vjTntPMzJqXIup/D5Y0GKiMiKHZ+iggImJskX3HA/dHxD11nOtGYBLwD2ByRGyXlQ8DvhgRJxU5xknCzKwJIqLYB++iOpewzxSgn6QtgFeBYcCx9ey/6uKSegNvRMQHktYH9gIui4glkpZJ2j07/7eAq4udrDEvxszMmqbBZqKIWAGcDDwEzAQmRMRsSaMlHQwgaZCkhcBRwPWSpmeHbwc8IWkaMBn4WUTMyraNAMYB80gd1A825wszM7PSNdhMZGZm7V/ZjkBuaKBbuZD0sqRnJU2T9GTDR7QOSeMkLZX0XEHZ+pIekjRX0p8krZtnjFlMxeI8X9KibLDi05KG5hljFlMfSY9KmpUNoDw1Ky+be1okxlOy8rK6n5K6SXoi+5uZLun8rHxLSY9n9/IOSaU0Y+cR53hJL2blT0vaKc84a0jqlMUzMVtv3P2MiLL7ISWpfwFbAF2AZ4Bt846rjlhfBNbPO44ice0N7AI8V1A2Fjg7Wx4JjCnTOM8HTs87tlpxbgrski33AOYC25bTPa0nxnK8n92z32sBj5MGpd4JHJ2V/wI4sUzjHA8ckXdsRWI9DfgNMDFbb9T9LNeaQYMD3cqIKMMaVkQ8BrxZq/hQ4OZs+WbKYKBfHXFC8cePcxMRSyLimWz5XWA20Icyuqd1xFgzfqfc7uf72WI30oMsAewD3J2V3wwcnkNoqykS58psvazup6Q+wEHArwuKv0Qj7mfZvYllPtVAt1YWwJ8kTZH03byDacAmEbEU0hsHsHHO8dTnB5KekfTrcmjOKiRpS1Jt5nHgs+V4TwtifCIrKqv7mTVpTAOWAA8DLwBvRUTNm+0iYLO84qtRO86ImJJtujC7n5dL6pJjiDV+DpxFNl5L0obAm425n+WaDNrSoLQvRMQgUlb+gaS98w6oHbgO+FxE7EL6I7wi53hWkdQDuAv4Yfbpu+z+XxaJsezuZ0SsjIhdSbWr3UlPHq6xW+tGVSSAWnFKGgCMijRG6v8BG5KaB3Mj6SvA0qxWWPPeKdZ8H633fpZrMlgE9C1Y70Oa9K7sZJ8GiYjXgXtJ/7HL1VJJn4VV80O9lnM8RUXE65E1dAK/Iv3R5S7rgLsLuDUi7suKy+qeFouxXO8nQEQsA/4CDAbWyybGhDL7my+Ic2hBTbCa1H+Q99/8XsAhkl4E7iA1D10JrNuY+1muyWDVQDdJXUkD3SbmHNMaJHXPPoUh6TPAl4EZ+Ua1mtqfDiYCx2fLw4H7ah+Qk9Xi1Ccz2wIcQfnc0xuBWRFxVUFZud3TNWIst/spaaOapipJ6wD7AbNIY5GOznbL/V7WEeecmvspSaQ+olzvZ0ScGxF9I2Jr0nvloxFxHI28n2U7ziB7/O0qUsIaFxFjcg5pDZK2ItUGgtS5dFu5xCnpdqCCVI1dSnqi5PfA74DNgQWk
Jw3eyitGqDPOfUjt3SuBl0lPQSzNKUQAJO0F/BWYTvr3DuBc4Engt5TBPa0nxq9TRvdT0o6kDs1O2c+dEXFR9vc0AVgfmAYcF5/MelxOcT4CbET6APMM8P2CjuZcSfoicEZEHNLY+1m2ycDMzFpPuTYTmZlZK3IyMDMzJwMzM3MyMDMznAzMzAwnAzMzw8nAzMxwMjAzM+D/AzuKPVMqSxEeAAAAAElFTkSuQmCC",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x1089a5f98>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "plt.plot(params, test_scores)\n",
    "plt.title(\"n_estimator vs CV Error\");"
   ]
  },
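  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the previous version, the best CV error a single ridge model reached was about 0.135; here, bagging 25 small ridge regressors brings it below 0.132."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Of course, if you have not tuned a ridge model beforehand, you can fall back on Bagging's default base model, a decision tree.\n",
    "\n",
    "The code is the same; just drop the `base_estimator` argument."
   ]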
  },
  {
   "cell_type": "code",
   "execution_count": 106,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "params = [10, 15, 20, 25, 30, 40, 50, 60, 70, 100]\n",
    "test_scores = []\n",
    "for param in params:\n",
    "    clf = BaggingRegressor(n_estimators=param)\n",
    "    test_score = np.sqrt(-cross_val_score(clf, X_train, y_train, cv=10, scoring='neg_mean_squared_error'))\n",
    "    test_scores.append(np.mean(test_score))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 107,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYcAAAEKCAYAAAD5MJl4AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XmYVOWZ/vHvjSCKoLihAhE1KOASF5C4xjbRiMZlNBpxJaNO8htNjJo4KjGKcY9LomMcl1FHowYmxrgmbqNtjDEG9w0ENwRF3FBwQ4Tn98d7Worugq7urq5TVX1/rquuqrP0OU8tfZ7zLuc9igjMzMwKdcs7ADMzqz5ODmZm1oKTg5mZteDkYGZmLTg5mJlZC04OZmbWgpODmZm14ORgVU3SnyUdknccZl2Nk4NVDUmnSrqucF5E7BYRv+uEfQ2StFBSVf4PSDpQ0kRJcyW9IelOSdtKGi3p1SLrLyNplqTdiiwbI+kLSXOyx9zsec3KvBurRVX5j2FWAQIie277H0vLlDecxbZ9HHAhcAbQD1gbuBTYE/gTsJKkbzT7s12BhcBdS9js3yNixezRJ3t+q8i+W7yv9rzXak26Vjp/gbZUkl6V9FNJT0uaLen3kpYt4e92l/Rk9jd/k7RJwbITJM3Izl4nSdpR0i7AWGD/7Mz2yWzdByQdlr0ek23rwmy7L0naOpv/uqS3JB1asJ/dJD0h6UNJ0ySdWhDig9nzB1kcX1dysqTXsm39j6QVs201lTQOkzQN+L8i7/mFwjP37Gz+HUmbSeop6XeS3s1if1TS6kW2sSJwGnBkRNwaEZ9GxIKIuDMiToiIecAfgEOb/ekhwA0RsbC176bIPl+V9B+SngY+yuJuPq+bpGHZ9zFb0rOS9ijYxjWSLs1KOHOBhrbGYVUmIvzwY4kP4FXgH8AaQF/gBeAHrfzNFsAsYATpzPyQbDs9gA2A14E1snXXBtbNXp8KXNdsWw8Ah2WvxwCfkw6MAk4HpgH/mW17Z2AO0Ctb/xvARtnrjYGZwJ7Z9CBgAaCCfR0GTMmW9QL+2BRPNm8h8D/A8kDPIu/7ZOD6gunvAC9kr38A3Ar0zGLfHOhdZBu7ZO+x21I+322AD5piAFYEPgE2WcL6Y4C/tvIdPwH0L9jmYvOA7sBU4ITs9Y7ZZ71+tv41wGxgq2x62bx/u3507OGSg5XiooiYFREfALcDm7Wy/hHAZRHxWCS/A+YBW5EOyMsCG0vqHhGvR0SLOvSleDUirot0BJoADAROi4j5EXEv6cA6GCAi/hoRz2evnwPGAzs0215htdKBwIURMS0iPgFOAkYXVJEEcGqks/l5RWL7PbCnpOWy6QOAG7PX84FVgQ2yz+TJiPioyDZWBd6NpZQAIuLvpOS7dzZrf+DFiHh2SX8DbC3p/ewxW9LUZssviog3m72vwnlbAStExLkR8UVEPADckb3HJrdGxD+yGD9fSixWA5wcrBSzCl5/AvRuZf1BwE8LD0akg3j/iHgZOAYYB8ySdGMbG0YLY/kUICLebTavN0BWVXS/pLclfQD8EFhtKdvuTyqJNJlGOkteo2DejCX9cfbeXgD2kLQ8qY2gKTn8DrgbGJ9VqZ2zhLr894DVSqiz/x2LqpYOBq5tZf1HImKV7LFyRKzfbHmx91U4rz8wvdnyacCAgunmy62GOTlYZ5gOnNnsYNQ7IiYARMT4iNielEQAzs2eyz1+/A3ALcCAiOgLXM6ikkKxfb1ZEBPZ6/ksnpBai3E8qQSyF/B8RLwCkJ1tnx4RG5GqhfagZbsBwCPAZ8C/tLKf64BvSdoK+DqLklB7FXtfhfPeBL7SbPnawButbMNqlJODdYYrgf8naSSApBWyxuEVJG2QNUAvS6oC+pRU1QTpILyOpLb0IFraur2B2RExP4vlwIJl75DaEL5aMO/3wLGS1pHUGzgTGF9QxVNKXOOBbwP/TsEBW1KDpI2zEsFHpKSzoPkfR8QcUtvLbyXtJWl5Sd0ljZJ0TsF6rwMPZzHfGxFvtxJXu3plFXgU+DhrpO4uqQHYPdu/1aGS
kkP2w5wsaYqkE4os317S45LmS9qnyPI+WVH64oJ5B0h6RtJTShc6rdKxt2KdpM1ngxHxOPBvwCWS3ic18o7JFvcEziEdnN8EVif1UoLUC0fAe5IeK3H/zZcXTh8JnC7pQ1Jj8YSCGD8lHfwfzqq+RgJXk6pr/gq8TKpCO3op+2oZTOoe+gipjn5CwaI1gZuAD4HnSQ3t1y9hG78GjstifpvUgH8UqRRU6FrS2XtrVUoAW6nldQ7Dl/K+FpsXEfNJ1WS7Ae8ClwCHRMTUYutb7VNq11vKCulMZwrwLdI/80RgdERMLlhnbVKPiZ8Bt0XEzc228RtSXe/7EXF0Vtf6JjA0ImZLOhf4OCJ+Wb63ZmZm7VVKyWEkMDXrwTGfVGzeq3CFrMfJcxQ5e8jOTvoB9xTOzp77ZFUIK5KShZmZVYFSksMAFu+FMIPFeygsUXbgPx84noI6z4j4glTkfzbb3jDgqtJCtmog6aSC6onCx515x2ZmHVdKcijWkFVq/eKRwJ0R0dSjQQCSupMa7DaNiAGkJDG2+CasGkXE2bFoGIbCx3fyjs3MOq57CevMIDV6NRlI6VVAWwPbSToS6AP0yC6tvxkgIl7L1vtf0pWXLUhyQ5eZWTtERLt7qZVScpgIDM7GllkWGA3ctpT1C6uPDo6IdSJiPVJj9XURMZbUN3qYpFWzVXcGJi1pg3ldPr6kx6mnnpp7DI6pvuJyTI6p3I+OajU5RMQC4EekBuXnSf2+J0k6TdLuAJJGSJoO7AtcJmlpl/ETETNJg4s9JOkpYFPgrI69FTMzK5dSqpWIiLuAIc3mnVrw+jFaXj3ZfBvXUtAfOyKuAK5oS7BmZlYZvkK6HRoaGvIOoQXHVLpqjMsxlcYxVU6rF8HlTVJUe4xmZtVGEtHJDdJmZtbF1ERyeOON1tcxM7PyqYnkcMghsKDF+JVmZtZZaiI5LFwI557b+npmZlYeNdEgPX16MHw43HILbL113hGZmVW/LtEgPXAgXHEFHHggfPBB3tGYmdW/mig5NMX4ox/B22/DhAnQpnuFmZl1MV2i5NDkvPNg8mS4+uq8IzEzq281VXIAeOEF2GEH+OtfYdiwHAMzM6tiXarkALDhhnDWWTB6NHz2Wd7RmJnVp5orOQBEwPe+B2utBRdfnFNgZmZVrMuVHCA1Rl9xBdx2G9x+e97RmJnVn5osOTT5+99hn33g8cdhQEl3tTYz6xq6ZMmhyTbbpO6tBx/s4TXMzMqpppMDwEknpedzzsk3DjOzelLT1UpN3ngDhg+Hm29OpQkzs66uS1crNRkwIDVQH3SQh9cwMyuHuig5NPnxj2HWLA+vYWbmkkOB886DF1+Eq67KOxIzs9pWVyUHgEmT4Bvf8PAaZta1ueTQzLBhcPbZHl7DzKwj6q7kAGl4jf33hzXWgP/8z04KzMysirnkUETT8Bp33JGG2DAzs7apy5JDEw+vYWZdlUsOS7HNNql760EHeXgNM7O2qOuSA6SksNNOMHQofOc7aZjvtdaCfv2ge/cyBmpmVkU6WnKo++QA8OabMG4cTJ8Ob70FM2fCe+/BKqssShZrrrnodfN5vXqV572YmVWKk0M7ffEFvP32omQxc+birwune/ZcPFmceCJsumnZQzIzKxsnh04WkcZrakoUV1yRqqjGjcstJDOzVnU0ObjWvRUSrLxyegwblhKEu8eaWb2r695KnWHoUJg8Oe8ozMw6l6uV2uijj1JPp7lzYZll8o7GzKw4X+dQYb17w2qrwbRpeUdiZtZ5nBzawVVLZlbvnBzawcnBzOqdk0M7ODmYWb1zcmiHYcPSTYXMzOqVk0M7uORgZvXOyaEd1lwTPv8c3n0370jMzDpHSclB0ihJkyVNkXRCkeXbS3pc0nxJ+xRZ3kfSDEkXF8zrIelySS9KekHS3h17K5Ujpaollx7MrF61mhwkdQMuAXYBNgIOkDS02WrTgDHADUvYzOlAY7N5PwdmRcSQiNgQeLANcefO
VUtmVs9KGVtpJDA1IqYBSBoP7AV8eWiMiNezZS0uZZY0HOgH3AWMKFh0GDCkYBvvtyP+3Dg5mFk9K6VaaQAwvWB6RjavVZIEnA8cD6hg/krZyzOy6qgJklYvLeTq4GolM6tnpZQcio3NUepgR0cCd0bEGylPfLmt7sBA4KGI+KmkY4ELgEOLbWRcwfjYDQ0NNDQ0lLj7zjN0qLuzmln1aGxspLGxsWzba3XgPUlbAeMiYlQ2fSIQEXFukXWvAW6PiJuz6euB7YCFQB+gB3BpRIyVNDci+mTrDQT+EhGbFNlmVQ2812T+fOjTJ93rYbnl8o7GzGxxlRh4byIwWNIgScsCo4Gl3dHgy2Ai4uCIWCci1gN+BlwXEWOzxbdL2jF7vRPwQtvDz0+PHrDuujB1at6RmJmVX6vJISIWAD8C7gGeB8ZHxCRJp0naHUDSCEnTgX2ByyQ9W8K+TwTGSXoKOAj4aXvfRF58pbSZ1Svfz6EDxo5NVUqnnJJ3JGZmi/P9HHLk7qxmVq+cHDrA1UpmVq9crdQBc+bAWmulW4Z2c5o1syriaqUcrbgi9O0L06e3vq6ZWS1xcuggXyltZvXIyaGDfKW0mdUjJ4cOco8lM6tHTg4d5ORgZvXIyaGD3J3VzOqRk0MH9e8Pn3wC79fU3SjMzJbOyaGDpFS19OKLeUdiZlY+Tg5l4O6sZlZvnBzKwN1ZzazeODmUgXssmVm9cXIoA1crmVm98cB7ZfD552mcpQ8/hJ49847GzMwD71WFZZeFQYPgpZfyjsTMrDycHMrE7Q5mVk+cHMrE7Q5mVk+cHMrE3VnNrJ44OZSJq5XMrJ64t1KZfPABfOUr6dahanf/ADOz8nBvpSrRty/07g0zZuQdiZlZxzk5lJGrlsysXjg5lJF7LJlZvXByKCP3WDKzeuHkUEauVjKzeuHkUEZODmZWL5wcymjgwNSV9cMP847EzKxjnBzKqFs3GDLEpQczq31ODmXmqiUzqwdODmXm7qxmVg+cHMrM3VnNrB44OZSZq5XMrB544L0ymzcPVloJ5s6FHj3yjsbMuioPvFdlevZMo7O+/HLekZiZtZ+TQydwu4OZ1Tonh07gdgczq3VODp3A3VnNrNY5OXQCVyuZWa0rKTlIGiVpsqQpkk4osnx7SY9Lmi9pnyLL+0iaIeniIstuk/RM+8KvTk3VSjXUycrMbDGtJgdJ3YBLgF2AjYADJA1ttto0YAxwwxI2czrQWGTbewNz2hBvTVhlFVh+eZg5M+9IzMzap5SSw0hgakRMi4j5wHhgr8IVIuL1iHgOaHGuLGk40A+4p9n8FYBjgTPaGXtVc9WSmdWyUpLDAGB6wfSMbF6rJAk4HzgeaH4xxunZsk9L2VatcY8lM6tlpSSHYlfYlVqbfiRwZ0S8sdgGpU2BwRFxW7b9dl/FV63cY8nMaln3EtaZAaxdMD0QeLPE7W8NbCfpSKAP0EPSR8DrwBaSXgF6AP0k3R8R3yy2kXHjxn35uqGhgYaGhhJ3n5+hQ+HOO/OOwsy6isbGRhobG8u2vVbHVpK0DPAi8C1gJvBP4ICIaFGjLuka4I6I+GORZWOA4RFxdLP5g4DbI+JrS9h/TY2t1OS112C77WDGjLwjMbOuqNPHVoqIBcCPSA3KzwPjI2KSpNMk7Z4FMULSdGBf4DJJz7Y3oHqx9towe3YagM/MrNZ4VNZOtPnmcOWVMGJE3pGYWVfjUVmrmLuzmlmtcnLoRO7Oama1ysmhE7k7q5nVKieHTuSSg5nVKjdId6JPP4WVV4aPPoLupVxRYmZWJm6QrmLLLw/9+8Mrr+QdiZlZ2zg5dDK3O5hZLXJy6GTuzmpmtcjJoZO5UdrMapGTQydztZKZ1SInh07WVK1Uox2uzKyLcnLoZKutlrqxzpqVdyRmZqVzcqgAVy2ZWa1xcqgAN0qbWa1xcqgAd2c1s1rj5FABLjmYWa1xcqgAtzmYWa3xwHsVsGAB9O4N77yTns3MOpsH
3qsByywD668PU6bkHYmZWWmcHCrEVUtmVkucHCrEjdJmVkucHCrE3VnNrJY4OVSIq5XMrJa4t1KFfPIJrLpqumXoMsvkHY2Z1Tv3VqoRvXrBGmvAq6/mHYmZWeucHCrIjdJmViucHCrI7Q5mViucHCrIJQczqxVODhXk7qxmViu65x1AVzJs2KJbhqrdfQiq2/z5MHcuzJkDyy+fGuHNrPY4OVTQ6qun53ffXfQ6bxHw2WfpYD537qJHe6e/+AJWXBH69Enz9t4bTj4Z1lsv73dqZm3h5FBB0qKqpXImh48/hkcfXfyA3ZaDe48e6WDe9Gg6uDefHjBg6cv79IHllltUKpo9G379a9hyS/jud+HnP4dBg8r3vs2s8/giuAo7/HD4+tfhBz8oz/ZmzoRdd4WePWHNNVs/eBeb7tGjPLEsyXvvwQUXwOWXw/77w9ixMHBg5+7TrKvr6EVwLjlUWDl7LE2ZAqNGwRFHwEknVW87xqqrwllnwbHHwnnnwde+BgcfDCeeCP375x2dmRXj3koVVq4eSxMnwg47pKqasWOrNzEUWn11+NWv0vvv3h023hiOOw5mzco7MjNrzsmhwspRcrj7bthtt1RNc/jh5YmrktZYAy68EJ5/Pt0lb9gw+I//SHfKM7Pq4ORQYeuuC2+9lQbia4/rr4dDD4VbboE99yxvbJW21lpw0UXwzDOpUX3IkFQ99t57eUdmZk4OFda9O3z1qzB1atv/9oILUhXSAw/AttuWP7a8DBwIv/0tPPUUvP8+bLAB/OIXqbeTmeXDySEHbW13WLgQfvYzuPpqePhh2HDDzostT2uvnarKHnsM3nwz3Xf7tNPgww/zjsys63FyyEFbBuCbPx/GjIF//AMeegi+8pXOja0arLsuXHVVes+vvAKDB8OZZ6ZrMsysMpwcclBqo/RHH8Eee6Qz53vugVVW6fzYqsngwXDttfC3v8ELL6TquHPPTZ+LmXWukpKDpFGSJkuaIumEIsu3l/S4pPmS9imyvI+kGZIuzqaXl3SHpEmSnpV0VsffSu0opVrpnXfgm99M9fE335xuFtRVDRkCN9wAjY3w5JMpaVxwQfsb9c2sda0mB0ndgEuAXYCNgAMkDW222jRgDHDDEjZzOtDYbN55ETEM2BzYTtIubYi7pg0ZkhqkFywovvzVV1OD8y67wJVXpkZsS20t48fDvffCI4+kJHHRRfDpp3lHZlZ/Sik5jASmRsS0iJgPjAf2KlwhIl6PiOeAFuNcSBoO9APuKVj/04h4MHv9BfAE0GUGVOjdG1ZbDV5/veWyp5+G7beHn/wETj+9Ni5uq7RNNoGbboI//zn13Fp//dTbad68vCMzqx+lJIcBwPSC6RnZvFZJEnA+cDxQ9DAnqS+wB/B/pWyzXhRrd2hshJ13ht/8Bo46Kpewaspmm6XrPW65Bf7yl5QkLr8cPv8878jMal8pFRbFDuqljoR3JHBnRLyR8sTi25K0DHAj8JuIeG1JGxk3btyXrxsaGmhoaChx99Wrqd1h113T9E03wZFHwoQJsOOO+cZWa0aMgDvuSCPTnnoqnH12uk7i0EM7f1BBs2rR2NhIY2Nj2bbX6qiskrYCxkXEqGz6RCAi4twi614D3B4RN2fT1wPbAQuBPkAP4NKIGJstvwqYExHHLmX/dTUqa5P/+q/UuHrFFalK5Oyz0wFus83yjqz2PfxwShKvvgqnnAIHHeR2G+t6OjoqaynVShOBwZIGSVoWGA3ctrSYml5ExMERsU5ErAf8DLiuIDGcAay4tMRQz5pKDr/4RWpUfeghJ4Zy2XZbuO++dNHg1VenhuwbblhyBwAza6mk+zlIGgVcREomV0XEOZJOAyZGxB2SRgB/AvoCnwFvRcQmzbYxBhgeEUdLamrHmAR8TqqmuiQiri6y77osOcycmYar3nJLuPPO6rkzXL2JSI3Wp5ySxmz65S9hv/3yjsqs83W05OCb/eQkAi69NF393Lt33tHUv4jU
BfaHP0xXWx94YN4RmXUuJwezNnjqqdQj7MEH63eMKjOoTJuDWd3YbLM0BMe++3oYDrOlccnBuqTDD0/Db9x4oy80tPrkkoNZO1xySeotdumleUdiVp1ccrAu66WXYJtt0vUlI0fmHY1ZebnkYNZOgwen4Ta+9z3fmtSsOZccrMs7/nh47rl0vUk3ny5ZnXDJwayDzjor9Vw688y8IzGrHi45mJHuWT1iBFx3Hey0U97RmHWcSw5mZdC/fxp/6ZBDYMaMvKMxy5+Tg1lmxx3h6KNTA/X8+XlHY5YvVyuZFVi4EPbaK/Vk+vWv847GrP1crWRWRt26pXaHW2+FP/wh72jM8uOSg1kRjz8Oo0bB3/4GQ4bkHY1Z27nkYNYJhg+HM85IA/R9/HHe0ZhVnksOZksQke63AXDttR6gz2qLSw5mnURK9/p+4gm48sq8ozGrLJcczFrx4ouw3XZw112pusmsFrjkYNbJhgxJQ3vvtx/Mnp13NGaV4ZKDWYmOOQZefjl1c/UAfVbtfA9pswr5/HNoaIA994QTT8w7GrPks89SifaDD9Lz7NkwZw4ceGDHkkP3cgZpVs+WXRYmTIAtt4SttkqJwqyjFi5MB/Omg3vhQb75vGLLImDllaFv38WfO8olB7M2uvfe1MX18cdhrbXyjsaqwbx57T+4z5kDvXu3PLg3PRebV7hsueWKd7N2tZJZDn75S7jvPrj/fuju8nfNi4C5c0s/uDdfZ/780g7kxZattFLn/IacHMxysHAh7LYbfO1r8Ktf5R2NQTpAt3aWvqRlH34Iyy/fvoP7yitDr17Vd5Gkk4NZTt59N133cNFF8C//knc0tS8i3ZGvLQf3wnXmzSu9Kqb5vJVWgh498v4EysvJwSxHjz4Ke+wBjzwCX/1q3tHkb/78dBbe3jP4nj3bXz3Tu3f1nb3nycnBLGeXXAJXXQV//3uqmqgHH30Er73W9jP4Tz9NZ+HtOYPv2zf1CLPycHIwy1kEHHggrLAC/Pd/5x1Nx3z8MVx8MVx4Iay++qKDeKkH+d69fYFgtehocnA/C7MOktLAfFtuCddcA//6r3lH1HaffQaXXw7nnAPf+AY89BAMHZp3VJYnJwezMujdG/74R9hhB9hiC9h007wjKs38+SmhnX46bL55GlywVmK3zuUCoFmZbLhh6rm0776pUbaaLVgA118Pw4bB//5vuiXqbbc5MdgibnMwK7OjjoK33oKbbqq+3jMR8Kc/wSmnwIorwplnwo475h2VdQY3SJtVmXnzYPvtYfRoOO64vKNJIuDuu+Hkk9MFfGecAbvuWn3Jy8rHycGsCk2bBiNHpnaI7bbLN5YHH0xJ4b330rAf++zjHkVdgW/2Y1aFBg1KDb2jR8OsWfnEMHEifPvbqffUv/0bPPtsag9xYrBS+Gdi1kl22y0dmA88MDUAV8qzz6bhPPbeG777XZg8GQ49FJZZpnIxWO1zcjDrROPGpXr9U0/t/H1NmQIHHAA775y61E6dCj/8oa86tvZxcjDrRMssAzfeCNdeC3fe2Tn7mDYNDj8cttkGNt4YXnoJjj22fobysHw4OZh1sn79YPx4OOywNF5Rubz1Fvz4x+miu7XWSiWFn/88XZBn1lFODmYVsO226b7T++2Xurp2xHvvwQknwEYbpWGmJ01KXVPLcWtIsyYlJQdJoyRNljRF0glFlm8v6XFJ8yXtU2R5H0kzJF1cMG8LSc9k2/xNx96GWfU75pjUi+nYY9v393PmpDaMIUPSFdhPP50GyOvXr6xhmgElJAdJ3YBLgF2AjYADJDUfkmsaMAa4YQmbOR1obDbvv4AjImIDYANJu7QhbrOaI8HVV6fbi96wpP+UIj75JN1tbvBgePVV+Oc/4bLLYODAzovVrJSSw0hgakRMi4j5wHhgr8IVIuL1iHgOaHG1mqThQD/gnoJ5awJ9IuKf2azrAN9Ly+reiiumC+OOOQaef37p686bl+4VMXhwumahsTE1bK+3
XkVCtS6ulOQwAJheMD0jm9cqSQLOB44HCq/UG5Btp83bNKt1m2wC55+frkGYO7fl8i++SCWMIUPgL3+BO+5IA+NtuGHlY7Wuq5Qhu4tdfl3qeBZHAndGxBtafBCXNm1z3LhxX75uaGigoaGhxN2bVacxY+Dhh+GII1JPJimNeTRhQromYsCA1AV2m23yjtRqRWNjI42NjWXbXqtjK0naChgXEaOy6ROBiIhzi6x7DXB7RNycTV8PbAcsBPoAPYBLgYuBByJiWLbeaGCHiPj3Itv02EpWlz77LB38v//91FD9i19Ar15ppNRvftOD4lnHVOJOcBOBwZIGATOB0cABS4up6UVEHFwQ6BhgeESMzabnSBqZbf9QUsIw6zKWWy4N6z18OKy9dkoKu+/upGDVodXkEBELJP2I1KDcDbgqIiZJOg2YGBF3SBoB/AnoC+wuaVxEbNLKpo8E/gdYDvhzRNzVkTdiVovWWy9dvLbKKh4Qz6qLh+w2M6tDHrLbzMzKzsnBzMxacHIwM7MWnBzMzKwFJwczM2vBycHMzFpwcjAzsxacHMzMrAUnBzMza8HJwczMWnByMDOzFpwczMysBScHMzNrwcnBzMxacHIwM7MWnBzMzKwFJ4d2KOdNvMvFMZWuGuNyTKVxTJXj5NAO1fhjcEylq8a4HFNpHFPlODmYmVkLTg5mZtaCIiLvGJZKUnUHaGZWpSJC7f3bqk8OZmZWea5WMjOzFpwczMyshapKDpKukjRL0jMF81aWdI+kFyXdLWmlCsYzUNL9kl6Q9Kyko/OOKdt/T0mPSnoyi+vUbP46kv6RxfV7Sd0rHFc3SU9Iuq0a4slieE3S09ln9c9sXt7f30qS/iBpkqTnJX0959/5Btnn80T2/KGko6vgczpW0nOSnpF0g6Rlq+Q39ZPs/y63Y0Jbj5WSLpY0VdJTkjYrZR9VlRyAa4Bdms07EbgvIoYA9wMnVTCeL4DjImJDYGvgKElDc46JiJgH7BgRmwObAbtK+jpwLnBBFtcHwOGVjAv4CfBCwXTe8QAsBBpznCpWAAAD5klEQVQiYvOIGJnNy/X7Ay4C/hwRw4BNgcl5xhQRU7LPZwtgOPAx8Kc8Y5LUH/gxsEVEfA3oDhxAzr8pSRtl+xxB+t/bXdJgKv9ZlXyslLQr8NWIWB/4IXBZSXuIiKp6AIOAZwqmJwNrZK/XBCbnGNstwE5VFlMv4DFgJPA20C2bvxVwVwXjGAjcCzQAt2Xz3skrnoK4XgVWbTYvt+8P6AO8XGR+VfymgG8DD+UdE9AfmAasTEoMtwE75/kbz/a5L3BFwfTJwPHApEp/ViUcKydlry8D9i9Y78tYl/aotpJDMf0iYhZARLwFrJ5HEJLWIZ0p/IP0weYaU1aF8yTwFumg/DLwQUQszFaZQfoHq5Rfk/5JIotvVWB2jvE0CeBuSRMlHZHNy/P7Ww94V9I1WTXOFZJ65RxTof2BG7PXucUUEW8CFwCvA28AHwJPkO9vHOA54BtZFU4vYDfgK1TH99f8WNkvmz8AmF6w3hvZvKWqheSQO0m9gZuAn0TER2QHwDxFxMJI1UoDSaWGYcVWq0Qskr4DzIqIp4CmftUqeF3ReJrZJiJGkP6Jj5K0fU5xNOkObAH8NlI1zsek6oDcf1OSegB7An/IZuUWk6S+wF6ks+P+wArArkVWrWiMETGZVLV1H/Bn4ClS9XM1K3atQ6ufWy0kh1mS1gCQtCapWFkxWYPXTcDvIuLWaoipUETMAR4kFbH7Smr6TgcCb1YojG2BPSW9Avwe+CbwG2ClnOL5UnYGRUS8Q6oWHEm+398MYHpEPJZN/5GULKrhN7Ur8HhEvJtN5xnTTsArEfF+RCwgtYFsQ36/8S9FxDURMTwiGoDZwBSq4/tbUgwzSKWbJiV9btWYHJqfcd4GfD97PQa4tfkfdLKrgRci4qJqiUnSak09ESQtT/pHegF4ANiv0nFFxNiIWDsi1gNGA/dHxMF5xdNE
Uq+s1IekFUj16c+S4/eXFfunS9ogm/Ut4Pk8YypwACm5N8kzpteBrSQtJ0ks+pxy/U0BSFo9e14b2Jv0meXxWS3tWPn9ghhuAw4FkLQVqWpuVqtbr2RjTgkNLDeSMto80o/jX0kNUvcBL5Lq1vtWMJ5tgQWkouOTpDrPUcAqecWUxbVJFstTwDPAz7P56wKPks5kJgA9cvgOd2BRg3Su8WT7b/rungVOzObn/f1tCkzMYrsZWKkKYlqe1IGgT8G8vGM6ldR4+gxwLdAj799UFtdfSW0PT5J6wlX8s2rrsRK4BHgJeJrUA6zVfXj4DDMza6Eaq5XMzCxnTg5mZtaCk4OZmbXg5GBmZi04OZiZWQtODmZm1oKTg5mZteDkYGZmLfx/18yVEzk5UhMAAAAASUVORK5CYII=",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x1089a1cf8>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "plt.plot(params, test_scores)\n",
    "plt.title(\"n_estimator vs CV Error\");"
   ]
  },
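  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Hmm, plain decision trees don't do so well here: the best result is only about 0.140."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Boosting\n",
    "\n",
    "In theory, Boosting is a step up from Bagging. It also gathers a set of weak learners, but chains them sequentially: each learner puts more weight on the samples the previous one handled poorly, so it concentrates its fitting on exactly those hard cases."
   ]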
  },
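  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the sequential idea concrete, here is a minimal sketch on synthetic data (not the housing data). It assumes scikit-learn's `AdaBoostRegressor` with its default base learner, a depth-3 decision tree, and compares it with a single tree of the same depth. The names `X_demo`, `y_demo` and `cv_rmse` are made up for this illustration, and the exact numbers depend on the random seed; only the comparison matters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.ensemble import AdaBoostRegressor\n",
    "from sklearn.model_selection import cross_val_score\n",
    "from sklearn.tree import DecisionTreeRegressor\n",
    "\n",
    "# Synthetic 1-D regression problem (hypothetical data, for illustration only)\n",
    "rng = np.random.RandomState(0)\n",
    "X_demo = rng.uniform(0, 6, size=(200, 1))\n",
    "y_demo = np.sin(X_demo).ravel() + rng.normal(scale=0.1, size=200)\n",
    "\n",
    "def cv_rmse(model):\n",
    "    # Same metric as the rest of the notebook: RMSE averaged over 10 folds\n",
    "    mse = -cross_val_score(model, X_demo, y_demo, cv=10, scoring='neg_mean_squared_error')\n",
    "    return np.mean(np.sqrt(mse))\n",
    "\n",
    "# One shallow tree versus 50 shallow trees fitted one after another by AdaBoost\n",
    "single_tree = DecisionTreeRegressor(max_depth=3, random_state=0)\n",
    "boosted = AdaBoostRegressor(n_estimators=50, random_state=0)\n",
    "\n",
    "print('single shallow tree :', cv_rmse(single_tree))\n",
    "print('AdaBoost, 50 trees  :', cv_rmse(boosted))"
   ]
  },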
  {
   "cell_type": "code",
   "execution_count": 97,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from sklearn.ensemble import AdaBoostRegressor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "params = [10, 15, 20, 25, 30, 35, 40, 45, 50]\n",
    "test_scores = []\n",
    "for param in params:\n",
    "    # AdaBoost wrapped around the ridge model defined earlier\n",
    "    clf = AdaBoostRegressor(n_estimators=param, base_estimator=ridge)\n",
    "    test_score = np.sqrt(-cross_val_score(clf, X_train, y_train, cv=10, scoring='neg_mean_squared_error'))\n",
    "    test_scores.append(np.mean(test_score))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYoAAAEKCAYAAAAMzhLIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xnc1XP6+PHX1aqFJCqVCjFCIkmLJWsxKbLmh+xmhmEwY58pw4xtCsNgBt2SpShL1ix18zWTtJdSoVS3Fi1KlLRcvz+uz63jdO5zn/vc55zP55z7ej4e5+Gcz/ks1/nkPtd576KqOOecc2WpFnYAzjnnos0ThXPOuaQ8UTjnnEvKE4VzzrmkPFE455xLyhOFc865pDxROOecS8oThctbIvKmiJwfdhzOFTpPFC4viMgAEXk6dpuqnqyqw7JwrVYislVEIvn3ISLnishEEVknIl+LyBsi0k1EzhGRBQn2ry4iy0Xk5ATv9ReRzSLyXfBYF/y3aW4+jcsHkfxDcC5kAmjw34ofLFI9s+H84tzXAYOBO4HGQEvgEaA38DLQQESOijvsJGAr8HYZp/2fqu4UPHYM/rsswbW3+1zpfNaoJmBXNv8Hc2kTkQUicr2ITBeRb0XkeRGplcJxvURkanDMRyLSLua9G0WkJPhV+5mIHCMiPYBbgLODX7xTg33HicjFwfP+wbkGB+f9QkS6BNsXicgyEbkg5joni8gUEVkrIgtFZEBMiB8E/10TxHG4mNtE5KvgXE+JyE7BuUpLIBeLyELg/QSfeXbsL/rgV/4KETlYRGqLyDARWRnEPkFEdktwjp2A24HfqeqrqrpBVbeo6huqeqOqbgReBC6IO/R84FlV3Vrev02Cay4QkRtEZDrwfRB3/LZqItI2+Pf4VkRmisgpMecoEpFHgpLPOqB7ReNwIVNVf/gjrQewAPgYaALsDMwGLi/nmA7AcqAj9ov9/OA8NYF9gUVAk2DflsCewfMBwNNx5xoHXBw87w/8hH1JCnAHsBB4KDj3CcB3QN1g/6OAA4LnBwJLgd7B61bAFkBirnUxMC94ry4wqjSeYNtW4CmgDlA7wee+DXgm5vWvgdnB88uBV4HaQeyHAPUTnKNH8BmrJbm/XYE1pTEAOwHrgXZl7N8f+LCcf+MpQLOYc/5iG1AD+By4MXh+THCv9wn2LwK+BToHr2uF/f+uPyr28BKFq6wHVXW5qq4BXgMOLmf/S4HHVHWSmmHARqAz9uVcCzhQRGqo6iJV3a7OPYkFqvq02rfRCKAFcLuqblLVd7Ev2TYAqvqhqs4Knn8KDAeOjjtfbNXTucBgVV2oquuBm4FzYqpRFBig9it/Y4LYngd6i8gOwet+wHPB801AI2Df4J5MVdXvE5yjEbBSk5QMVPV/WCI+Ldh0NjBXVWeWdQzQRURWB49vReTzuPcfVNUlcZ8rdltnoJ6q3qOqm1V1HPB68BlLvaqqHwcx/pQkFhdBnihcZS2Peb4eqF/O/q2A62O/mLAv9Gaq+iXwB2AgsFxEnqtgo2psLBsAVHVl3Lb6AEF10lgR+UZE1gBXALsmOXczrIRSaiH267lJzLaSsg4OPtts4BQRqYO1KZQmimHAGGB4UO12dxl1/6uAXVOo4x/Gtuqn84Ch5ew/XlV3CR4NVXWfuPcTfa7Ybc2AxXHvLwSax7yOf9/lEU8ULtcWA3+L+2Kqr6ojAFR1uKoeiSUUgHuC/2Z6PvxngVeA5qq6M/BvtpUgEl1rSUxMBM838cvkVF6Mw7GSSR9glqrOBwh+hd+hqgdgVUensH07A8B44Efg1HKu8zRwnIh0Bg5nW0JKV6LPFbttCbBH3Pstga/LOYfLE54oXK49DvxGRDoBiEi9oGG5nojsGzRe18KqiTZg1VFgX8itRaQiPZGS7Vsf+FZVNwWxnBvz3gqszWHvmG3PA9eKSGsRqQ/8DRgeUw2USlzDgROB3xLz5S0i3UXkwKCk8D2WgLbEH6yq32FtNf8SkT4iUkdEaohITxG5O2a/RcB/g5jfVdVvyokrrd5dMSYAPwQN3DVE
pDvQK7i+KwApJYrgf8Q5IjJPRG5M8P6RIjJZRDaJSN+Y7S1FZFLQu2SmiFwR89644JxTg/d3Dbb3D6oDpgSPizPxQV1WVPhXoqpOBi4DHhaR1VgDcf/g7drA3dgX9RJgN6y3E1hvHgFWicikFK8f/37s698Bd4jIWqyheURMjBuwRPDfoHqsEzAEq9L5EPgSq2a7Osm1tg/GupyOx+r0R8S81RQYCawFZmGN9M+UcY77geuCmL/BGv+vxEpHsYZiv+rLq3YC6Czbj6M4NMnn+sU2Vd2EVaWdDKwEHgbOV9XPE+3v8o9Yu1+SHexXzjzgOOyPdyJwjqrOidmnJda74o/AaFV9KdheI7jGJhGpi/0RdFHVZSIyDrhOVafGXa8/cKiqxv4ROuecC0mNFPbpBHyuqgsBRGQ4Vsf6c6IIirqISPwvjc0xL+uwfRG3rBJNZYvCzjnnMiSVqqfm/LLHQgm/7M2QlIi0CAbmLATu0V+O+BwSVC/dFndYXxGZJiIviEiLVK/lokFEbo6pwoh9vBF2bM65ikslUST6dZ9ynaOqlqhqe6z/+oUxI07PDbYfCRwpIucF20cDrVX1YGyEayp1rC5CVPUu3TYVROzj12HH5pyruFSqnkqwRrFSLbC2igoJ2iVmYYnhJVVdGmz/QUSew6q4nlHVb2MOe5xt3SN/Ib6ayznnXGpUtULV+6mUKCYCbcTms6kFnIP96i/LzwGISPPSkagi0hDoBswN5otpFGyviXWl+zR4HTvAqg82SCmhMIayV/QxYMCA0GPwOD1Gj9PjLH2ko9wShapuEZGrgHewxPKkqn4mIrcDE1X1dRHpiM1cuTPQS0QGqmo7oC0wSES2YgnkXlWdFfSAGhP0iqoOvIeVHgCuFpHeWF/y1cCFaX0y55xzGZFK1ROq+jbwq7htA2KeT2L7kZmo6ntA+wTb12OTwiW61i1s6zvvnHMuZD4yO8u6d+8edggp8TgzJx9iBI8z0/IlznSUO+AuqkRE8zV255wLi4igWWjMds45V4V5onDOOZeUJwrnnHNJeaJwzjmXlCcK55xzSXmicM45l5QnCuecc0l5onDOOZeUJwrnnHNJeaJwzjmXlCcK55xzSXmicM45l5QnCuecc0l5onC88QasWRN2FM65qPJEUYWpwi23QK9e8OKLYUfjnIsqTxRV1E8/wfnnw7hxcOed8N//hh2Rcy6qUloK1RWWNWugb19o0ADefx/mz7fXzjmXiJcoqpjFi+GII+DAA2HkSKhbF/bfH1asgG++CTs651wUeaKoQqZPh65d4aKL4MEHoXp1216tGnTuDP/7X7jxOeeiyRNFFfHuu3DCCTBoEFx/PUjcirndunmicM4l5omiCnjqKTjvPBg1Cs46K/E+Xbt6g7ZzLjFR1bBjSIuIaL7GniuqcMcdUFQEb74JbduWve/330OTJrB6NdSunbsYnXO5JSKoqpS/5zZeoihQmzbBZZfBq6/C+PHJkwRA/fqw334weXJu4nPO5Q9PFAVo3Tro3RuWLoUPPoCmTVM7rls3r35yzm3PE0WBWboUjj4a9tjDShP166d+rDdoO+cS8URRQGbPhi5d4PTT4d//hhoVHE5Z2qDtTT/OuVieKArEBx/AMcdY4/Wtt27f/TUVe+wBO+wAX3yR+ficc/nLE0UBGD4czjwTnnvO5m+qDK9+cs7F80SRx1Th3nvhhhtszqbjjqv8OX08hXMunk8KmKe2bIGrr4b/+z8rAbRokZnzdusGjz2WmXM55wqDD7jLQ+vXQ79+8MMPNtq6QYPMnXvzZthlF1i4EBo2zNx5nXPR4APuqoBvvrFG6513ttHWmUwSYD2lDjvMBuk55xx4osgrn39ubQg9etj8TbVqZec63qDtnIvliSJPjB8PRx4JN90Ef/1ret1fU+UN2s65WCklChHpKSJzRGSeiNyY4P0jRWSyiGwSkb4x21uKyCQRmSIiM0Xkipj3xgXnnBq8v2uwvZaIDBeR
z0VkvIi0LCuu1asr9mHz1csv25QcRUVw6aXZv16XLjBpks0X5Zxz5SYKEakGPAz0AA4A+onIfnG7LQT6A8/GbV8CdFHVDsDhwE0iEjvzUD9VPURVO6jqymDbJcBqVd0HeAC4t6zY9trLps1+6y3rBVSI/vlPuOoqePttOOmk3FyzQQPYc09b6Mg551IpUXQCPlfVhaq6CRgO9IndQVUXqeqngMZt3xwcA1AHiK8wSXT9PsDQ4PlIoMzRAQsWwLHHwsCB0LIl3HwzzJ2bwifKA1u32gJDjz5q1UCHHprb63v1k3OuVCqJojmwOOZ1SbAtJSLSQkSmY6WOe1R1WczbQ4Jqp9sSXU9VtwBrRGSXROdu2BB+8xuYMAHeece6dh59tDXGPvEEfPddqlFGy48/wjnnWPXP//4HrVvnPgafSdY5VyqVAXeJmk1THsCgqiVA+6DK6VURGamqK4BzVXWpiNQDXhKR81T1mQTXk7KuN3DgwJ+fd+/enfvu687f/27VNEVF8Mc/Wt3+RRdZAqmWB033q1dDnz7QrBmMGWNzL4WhWzdrOFfNbsO5cy67iouLKS4urtQ5yh1wJyKdgYGq2jN4fROgqnpPgn2LgNdU9aUyzjUEeD3+fRHpDxyqqleLyNvAAFWdICLVgaWq2jjBucodcPfNNzb/UVGRlS4uvBD69w/nF3oqFiywdojeveHuu8NNbKqw++5WWmvVKrw4nHOZla0BdxOBNiLSSkRqAecAo5PFERNQcxHZIXjeEOgGzBWR6iLSKNheE+gFfBocNhprGAc4Exhbgc/zC40bwx/+ANOm2QjmlSuhY0dr1xg2zEY4R8WkSXDEEfD739v8TWGXfkS8+sk5Z1KawkNEegIPYonlSVW9W0RuByaq6usi0hF4GdgZ+BFYpqrtROR4YBCwFUsgD6nqkyJSF/gQq/qqDrwHXKeqKiK1gWHAIcAq4BxV/SpBTGlN4bFxI4webaWMjz+2tRsuvhg6dw6viuWNN6y08/jjcOqp4cSQyODBMH8+PPxw2JE45zIlnRJFlZ7r6euvrWRRVGRJ4sIL4YILrH0gV/7zHxgwwMZKdO6cu+um4uOP4be/halTw47EOZcpnijSpGojn4uKYORIG3B20UXWVlC7dkYukfCat90GL7xg40DatMnOdSrjp59sgsClS2HHHcOOxjmXCT4pYJpEbNzA449DSYnNzProozZ19+9/D1OmZHZ50J9+spLL2LHW/TWKSQJsLqlDDrEGbedc1eWJIk69erZK3Nix8Mkn0KgR9O0LBx8MDzwAK1ZU7vxr11rPpnXrbLGh3XbLTNzZ4g3azjlPFEnsuaeN+p4/H+6/HyZPhn32scTx2ms2wK8iFi+2nk3772+9sOrWzUrYGeUzyTrnvI2igtautXaFoiIb93Deedaesf/+yY+bPh169bLuutddlz+D2FauhL33toGA1auHHY1zrrK8jSIHGjSAyy6zX9njxtmX5/HHw+GH2xKia9Zsf8y778IJJ8A//mHzN+VLkgDYdVcbePfpp+Xv65wrTJ4oKmG//WwE9aJFVkU1dqyN+j73XEsOW7fC0KFW6hg1Cs4+O+yI0+PVT85VbV71lGGrVsHzz1vV1JIlNlfTm29C27ZhR5a+J5+00tMzz4QdiXOusnwcRcTMng1NmljPqXw2Z4711FqwIOxInHOV5YnCZcXWrdaNd+bM3I5ad85lnjdmu6yoVs0GJHo7hXNVkycKlxIfeOdc1eWJwqXESxTOVV3eRuFSsmGDjalYsSI/RpQ75xLzNgqXNXXqQLt2MHFi2JE453LNE4VLmVc/OVc1eaJwKfMGbeeqJm+jcClbuhQOPNDaKcJe09s5lx5vo3BZtfvuNini3LlhR+KcyyVPFK5CvPrJuarHE4WrEG/Qdq7q8UThKsRLFM5VPZ4oXIUccAAsX175tcOdc/nDE4WrkOrVbTW/8ePDjsQ5lyueKFyFefWTc1WLJwpXYZ4onKtafMCd
q7B162xMxapVULt22NE45yrCB9y5nNhxR9h3X5gyJexInHO54InCpaVbNx9P4VxV4YnCpaVrV2+ncK6q8DYKl5ZFi+Cww2DZMpAK1XY658LkbRQuZ/bYA2rWhPnzw47EOZdtnihcWkS8m6xzVYUnCpc2TxTOVQ2eKFzafCZZ56qGlBKFiPQUkTkiMk9Ebkzw/pEiMllENolI35jtLUVkkohMEZGZInJFgmNHi8iMmNcDRKQkOGaKiPRM98O57GrfHr76CtasCTsS51w2lZsoRKQa8DDQAzgA6Cci+8XtthDoDzwbt30J0EVVOwCHAzeJSNOYc58GfJfgsoNVtUPweDvlT+NyqmZN6/nkEwQ6V9hSKVF0Aj5X1YWqugkYDvSJ3UFVF6nqp4DGbd8cHANQB/i5S5aI1AOuBe5McE3vcJknvPrJOfjwQ3jxxbCjyJ5UEkVzYHHM65JgW0pEpIWITMdKHfeo6rLgrTuAfwAbEhx2pYhME5EnRKRBqtdyuecN2s7BnXdC//4wZEjYkWRHKoki0a/7lEe6qWqJqrYH2gAXishuItIeaKOqo4Pzx17jEWBvVT0YWAYMTvVaLve6dIGJE2HTpvL3da4QffMNfPKJVcEOGABPPhl2RJlXI4V9SoCWMa9bYG0PFaKqy0RkFnAk0BjoICLzgZpAYxEZq6rHqmrs2mmPA6+Vdc6BAwf+/Lx79+507969omG5Stp5Z2jVCmbMgEMPDTsa53Jv5Eg4+WTr3DF2LBx7rG2/5JJw4ypVXFxMcXFxpc5R7hQeIlIdmAscBywFPgH6qepnCfYtAl5X1VHB6+bAKlX9UUQaAh8DfVV1VswxrYDXVPWg4HXT0uopEbkWOExVz01wLZ/CIyKuuMKWSL366rAjcS73jj4arr8eeve2159/bsliwAC49NJwY0skK1N4qOoW4CrgHWAWMFxVPxOR20WkV3DhjiKyGDgDeExEZgaHtwUmiMhUYBxwb2ySKMO9IjJDRKYBR2MN3i7CvEHbVVUlJTBzJvTosW3bPvtYyeL22+GJJ8KLLZN8UkBXaV98AcccA4sXl79voVO1pNmlC1Tz4awF7/77LVEkasQuLVn85S9w2WW5j60sPimgC8Xee8NPP9mMslXdlClwxBFw4onw9ddhR+OybfhwOPvsxO/tsw+MGwd33AGPP57buDLNE4WrNBGvfipVVAR//rPVW3foAKNGhR2Ry5YFC+xR2nidSJs2Vg11xx3wn//kLrZMS6XXk3PlKh1Pcc45YUcSnh9/tF+YkydbT7ATToDzzoM334QHH4T69cOO0GXSiBFw+uk2Q0EypcmiNKFcfnn2Y8s0L1G4jPCBd/DKK1aKaNXKXnfuDFOnWrvFIYdYX3tXOJJVO8UrTRZ33gn//nd248oGTxQuIzp0gLlz4fvvw44kPEOGwEUX/XLbjjva9rvuglNOsS+KLVvCic9lzpw5NtDuyCNTP6ZNG2uz+Nvf8i9ZeKJwGVG7tv1qnjAh7EjCsWiRVTmdemri9884w94fN87aL776KqfhuQwbMQLOOguqV6/YcXvvvS1ZPPZYdmLLBk8ULmO6dq261U9Dh1r7TJ06Ze/TogW8+64lk8MOg2fj51p2eUHVqp3SbY8rTRZ33ZU/ycIThcuYbt2qZs+nrVutt9PFF5e/b7Vq8Mc/wjvv2K/Kc8/19TzyzYwZsGEDHH54+ufYe29rs7jrLnj00czFli2eKFzGdO0KH39c9ergP/jAejR16JD6MYccApMmQcOGcPDBNk21yw8jRlgjtlRyMYTSksXdd0c/WXiicBmz227QuDHMKm+SlgJTWpqo6BdH3brwr3/Bww/bF8+tt/osvFFX2WqneHvttS1ZPPJIZs6ZDZ4oXEZVteqntWth9Gj4f/8v/XP06gXTptmja1eYNy9z8bnMmjgRatSwUmCmlCaLe++NbrLwROEyqqqNpxgxAo4/3kpTldGkCbz+Olx4od3Dxx+3X68uWkaMsNJE
Zaud4u21l7VZ3HuvlTKjxicFdBk1e7aNF/jyy7AjyY3OnW3Kjl//OnPnnD3bGrn33NMSxq67Zu7cLn1bt0LLltYRYf/9s3ONBQtsgs0//QmuvDI71/BJAV3o9tvPevEsW1b+vvlu9mwbPxE7xXQm7L+/jUdp08YWw3nnncye36Xnv/+FXXbJXpIA+3Ewbhzcd1+0ShaeKFxGVatmU2xXheqnoiJbJ7lGFmZMq13bviyGDrWV0q691uaScuEp7e2UbVFMFp4oXMZVhZlkN22CYcO2n7Ij044/3hq5Fy+GTp3g00+zez2X2ObN8OKLuUkUYMmiuNiSxcMP5+aayXiicBlXFRq033zTqob23Tf712rUyL6krr3W6q//+U9v6M614mJrn2jTJnfXbN3arjtoUPjJwhOFy7jDDrNVvzZsCDuS7El1JHamiFjpZfx4eO45OPnkqtEOFBUVmSk2k1q3tmqoQYPgoYdyf/1SnihcxtWtCwccYCOPC9GyZTYa+8wzc3/tNm3g//7PkvEhh9gYDpddP/0EL79skwCGoTRZDB4cXrLwROGyopCrn555Bk47zaYQD0PNmvDXv8LIkXDNNfCb38APP4QTS1Xw7rvQtq1VPYUlNln885+5v74nCpcVhTqTrKqtL5HLaqeydOtmDd3r19s8U5Mnhx1RYcrklB2VUdpm8cADuU8WPuDOZcWSJdCuHaxcmflRrGH6+GO44AJbpClKn2v4cLj6arjuOhusVdF1ElxiGzZAs2bw2WfQtGnY0ZiFC61TwzXX2KOifMCdi4xmzWCnnewLtZAUFVmjcpSSBNgv3kmT4K234LjjbCCgq7y33rLSWlSSBNhSu+PG2TrsDz6Ym2t6onBZU2jjKdavt26qF1wQdiSJtWxp8wX17AkdO9oAMVc5Ual2ihebLB54IPvX80ThsqbQGrRHjbJR582bhx1J2apXh5tusnEef/mLJbXvvgs7qvz0/fcwZgz07Rt2JIm1amVtFg89lP1k4YnCZU2hJYohQ7I/EjtTOnaEKVNsadaDDy6skl2uvPaa/T/cqFHYkZStZUsrWTz0ENx/f/au44nCZc2BB8LSpdagne/mz7fpM045JexIUlevHvz73/YF0rcvDBhgU1G41ES12ileabJ4+OHsJQtPFC5rqle3dYXHjw87ksp76ilbnKh27bAjqbg+fWDqVOuxdcQRVWcK+MpYs8aqdfr0CTuS1MQmi8GDM39+TxQuqwphPMWWLZYo8qXaKZHdd7cePP362RoaTz3l80Ul88orcOyx0KBB2JGkrmVLS26PPJL5ZOGJwmVVISyN+v77toJd+/ZhR1I51apZv/uxY23uoMsuCzui6MqXaqd4e+xhJYtHHrF/40zxAXcuq777zsZUrF4NtWqFHU16+vWzKptsrTgWhg0b4Fe/smlAOnUKO5poWbHC5tRassTaefLR4sU2KO+3v4Xrr//lez7gzkXOTjvZH92UKWFHkp7Vq7dV2RSSOnXghhvgb38LO5LoGTUKTjopf5MEbCtZPPoo/OMflT+fJwqXdflc/fT88/alscsuYUeSeZdcAhMnwvTpYUcSLSNG5Ge1U7w99rA2i8ceq3yy8EThsi6fx1Pk09iJiqpTx6olvFSxzZIlNtFiz55hR5IZLVpsSxb33Zf+ebKw2q9zv9S1q01Wpxq9OZKSmT7d6quPOy7sSLLniivg3ntt0ru2bcOOJnwvvgi9e8MOO4QdSeaUJotjjkn/HF6icFnXqpWNqViwIOxIKqaoCC68sLBnYq1f33pC3XVX2JFEQ6FUO8Vr0cLaLP7zn/SOTylRiEhPEZkjIvNE5MYE7x8pIpNFZJOI9I3Z3lJEJonIFBGZKSJXJDh2tIjMiHndUETeEZG5IjJGRPKoJ7NLRCT/qp82boRnn7VEUeiuvNLmhqrqA/G++grmzYPjjw87kuxo0QI++ii9Y8tNFCJSDXgY6AEcAPQTkf3idlsI9Aeejdu+BOiiqh2A
w4GbROTnCXtF5DQgfsqym4D3VPVXwFjg5tQ/jouqfJtJ9rXXbAqSvfYKO5Lsa9AAfvc7uPvusCMJ1wsvwOmn2wqChapJk/SOS6VE0Qn4XFUXquomYDjwi4HtqrpIVT8FNG775uAYgDrAzzXUIlIPuBa4M+56fYChwfOhwKkpfhYXYflWoigqisYqdrlyzTXw0ktVex2L4cPh7LPDjiKaUkkUzYHFMa9Lgm0pEZEWIjIdK3Xco6rLgrfuAP4BbIg7pLGqLgcI9t0t1Wu56Dr4YJtYb82asCMp39df2/xUp58ediS506gRXHqpNWxXRfPm2QSWRx8ddiTRlEqiSNRPJeUh0apaoqrtgTbAhSKym4i0B9qo6ujg/HnUF8alo2ZNm/p6woSwIynf00/DmWdC3bphR5Jb110Hzz1nX5hVzYgR9m9eyB0XKiOV7rElQMuY1y2wtocKUdVlIjILOBJoDHQQkflATaCxiIxV1WOB5SLSRFWXB+0Z35R1zoEDB/78vHv37nTv3r2iYbkcKq1+6tEj7EjKpmpjJ4YNCzuS3GvSxBY6GjQoM6N588nw4en3CIq64uJiiouLK3WOcud6EpHqwFzgOGAp8AnQT1U/S7BvEfC6qo4KXjcHVqnqjyLSEPgY6Kuqs2KOaQW8pqoHBa/vAVar6j1BD6uGqnpTgmv5XE955o03bFbL998PO5KyffQRXH45zJqVX2M+MqWkBA46yNY6362KVPp++imcfLL1eqpWBQYMZGWuJ1XdAlwFvAPMAoar6mcicruI9Aou3FFEFgNnAI+JyMzg8LbABBGZCowD7o1NEmW4BzhBROYCxwNVvC9G4ejSxaaMiPLiOUOGWCN2VUwSYF0ozzorN+swR0VpI3ZVSBLp8tljXU4dcIBV63ToEHYk21u3zub0nzMn/W6EheCrr+DQQ+GLL6Bhw7CjyS5V2Gcfa6M49NCwo8kNnz3WRV6UFzJ68UU46qiqnSQAWre2aSweeijsSLJv8mT7bxR/uESJJwqXU1GeSbaqjZ1I5uabLVGsWxd2JNnZTkd6AAASfUlEQVRVOmVHVa1qTJUnCpdTUS1RzJsHn39ujZoO9t0XTjjB1jMoVFu3Fu7cTpnmicLl1D77wI8/2gpcUVJUBOefX9jTN1TULbdYL7X168OOJDvGj4cdd7SpWlxynihcTolEb96nzZth6NDCXXciXQceaP9Wjz8ediTZ4aWJ1HmicDkXteqnMWOst9P++4cdSfTcdpsteLNxY9iRZNaWLdZ5wed2So0nCpdzUWvQ9kbssnXoAO3b2z0qJB98AM2aWVuMK5+Po3A59+OPNgnd8uW2cE6YVqywdpOFC226bbe98eOhXz9r7C+UNpzLL4c2beCGG8KOJPd8HIXLCzvsYL9SP/kk7EhscaLevT1JJNOli32pPhu/2kye2rTJplQ/66ywI8kfnihcKKJQ/VQ6AaA3Ypfvttvg73+3uv189957VuXUunXYkeQPTxQuFFFYyGjKFPj+e1+DIBVHHw2NG9sqcPnOFyiqOG+jcKFYvhx+9StYvTq8ydiuvBKaNoU//zmc6+ebMWPg+uthxoz8nUDvxx9h991tduBmzcKOJhzeRuHyRpMmNo317NnhXH/DBvtl2b9/ONfPRyeeCHXqwCuvhB1J+t5+21ZbrKpJIl2eKFxowhxP8corNltoy5bl7+uMiLVV3Hmnte/ko+HDfZBdOjxRuNCE2U7hYyfSc8op1qD91lthR1JxP/xgcVeltdAzxROFC01YPZ8WLrSG7FNPzf218121anDrrXDHHflXqnj9devqu+uuYUeSfzxRuNC0bQurVlnDdi4NHWrVDzvskNvrForTT4dvv4WxY8OOpGK82il9nihcaKpVs194uax+2rrVqp187ET6qle3UsWdd4YdSerWrrW12r0UmR5PFC5Uua5++uAD2GknX9Gssvr1syq8jz4KO5LUvPoqHHMM7Lxz2JHkJ08ULlS57vk0ZIg1
YvuKZpVTo4atgpcvpQqvdqocH3DnQvXDDzbid9Wq7LcZrF0LrVrBF194g2Ym/PSTzQE1ahQcdljY0ZRt5UrYe2/4+uvwJ6GMAh9w5/JOvXq2DsSkSdm/1vDhcPzxniQypVYtm331b38LO5LkXnoJevTwJFEZnihc6HJV/eRjJzLvkktsFuAZM8KOpGy+kl3leaJwoctFg/asWbZO94knZvc6VU2dOjb/U1RLFUuX2piZk04KO5L85onCha50De1sNjkVFdm8TjVqZO8aVdUVV8C4cTBnTtiRbG/kSOjVyxKaS58nChe6Fi2gbl2YNy8759+0CYYNgwsvzM75q7r69eGaa2y9iqjxaqfM8EThIiGb1U9vvmkL1fj6yNlz1VV2n7/8MuxItlm0CD77DE44IexI8p8nChcJ2WzQLh074bKnQQP43e/gnnvCjmSbF16Avn2td5arHB9H4SJh6lQ491z7BZhJy5bZnFKLF3v3yGxbtcpKbVOnRmP69o4d4e67rUu028bHUbi81a6dDYhatSqz5x02DE47zZNELjRqBJdeCvfdF3YkNqhy8WLo3j3sSAqDJwoXCTVqQKdOMH585s6p6tVOuXbddfDss1aSC9OIEXDmmd7LLVM8UbjIyHSD9oQJtshOt26ZO6dLrkkTOP98+Mc/wo1j+HA4++xwYygknihcZGR6xTufADAcf/qT3fuVK8O5/qxZtl6G/0DIHG/MdpGxdi00bw6rV1e+p8oPP9j4jFmzoFmzzMTnUveb39icWmHMLvuXv9i//6BBub92PvDGbJfXGjSwWT6nTav8uUaNsi63niTCceON8OijsGZNbq+r6tVO2eCJwkVKpsZT+ASA4dpzTzjlFHjoodxed+pU2Lw52tOe56OUEoWI9BSROSIyT0RuTPD+kSIyWUQ2iUjfmO0tRWSSiEwRkZkickXMe2+JyNRg+yMiVpMsIgNEpCQ4ZoqI9MzEB3X5IRPtFF9+aVVOp5ySmZhcem65xRLFunW5u2bplB3eLpVZ5SYKEakGPAz0AA4A+onIfnG7LQT6A8/GbV8CdFHVDsDhwE0i0jR470xVPURV2wGNgTNjjhusqh2Cx9sV/lQub5Umiso0Pz31lA3e8xG54dp3XzjuOKuCyoXSaief2ynzUilRdAI+V9WFqroJGA70id1BVRep6qeAxm3fHBwDUAeQmPe+BxCRmkCtuGP990AV1bq1/ferr9I7fssWSxQXXZShgFyl3HILDB4M69dn/1off2yTS7Zrl/1rVTWpJIrmwOKY1yXBtpSISAsRmY6VOu5R1WUx770NLAO+A0bGHHaliEwTkSdEpEGq13L5T6Ry1U/vv299+du3z2xcLj3t2lm70xNPZP9aXu2UPamMW0x021OuGFDVEqB9UOX0qoiMVNUVwXs9RaQWVmV1LPA+8AjwV1VVEbkTGAxckujcAwcO/Pl59+7d6e7j9QtC6cC7886r+LE+Ejt6br0V+vSxdStq187ONbZssUkAx47NzvnzWXFxMcXFxZU6R7njKESkMzBQVXsGr28CVFW3mydSRIqA11T1pTLONQR4Pf59EbkA6KiqV8dtbxWc76AE5/JxFAVqwgS4/HKYPr1ix61ebb1tvvoKGjbMSmguTSefvC1ZZENxMfzhD5npWl3osjWOYiLQRkRaBb/+zwFGJ4sjJqDmIrJD8Lwh0A2YKyL1Shu1RaQGcDIwJ3jdNOZcfYFPK/B5XAE45BDrubR2bcWOe+45+0LyJBE9f/6zzeS6aVP5+6bDG7Gzq9xEoapbgKuAd4BZwHBV/UxEbheRXgAi0lFEFgNnAI+JyMzg8LbABBGZCowD7lXVWUA9YLSITAOmAsuBx4Jj7hWRGcF7RwPXZurDuvxQqxZ06GAli4rwsRPR1aUL7LWXTRiYaZs22QBLH2SXPT6Fh4ukm2+2hHH77antP22aVW3Mnw/Vq2c3NpeeceOs6umzzzL7bzRmDAwYYL2eXPl8Cg9XMCo6
k2xRka2J7Ukiurp3h8aN4cUXM3ten7Ij+7xE4SJp1SprmF69uvw1BTZutAkAP/nEjnHR9fbbNrvs9OlQLQM/UzduhN13h5kzbUJJVz4vUbiC0aiR/eHPnFn+vq+9Zv31PUlEX48esMMO8OqrmTnfmDH2b+9JIrs8UbjISrX6ycdO5A8RuO02m348ExUCXu2UG54oXGSlMpNsSYk1Yvbtm3w/Fx2nnGI9ld6u5Cxu69fDG2/AGWdkJi5XNk8ULrJSmcrj6adtbeS6dXMTk6u8atVstPYdd1SuVPHGG3D44dZA7rLLE4WLrH33tZXKSkoSv6/qYyfy1RlnWEeFcePSP4cPsssdTxQuskSs+qmsdoqPPrKxFp065TYuV3nVq9vMsukulfrdd/Dee3DaaZmNyyXmicJFWrLqp9JGbJ8tND/162fzcqUzU/Do0XDUUT5dS654onCRVlaJYt06ePnl9GaYddFQs6aNwE+nVOHVTrnlA+5cpG3YALvuCt98A/Xqbdv+5JM2fuKVV8KLzVXexo3Qpo0l/Y4dUzumdJbgkhLYccfsxleIfMCdKzh16sBBB9mo61jeiF0YateGG26oWKnipZfghBM8SeSSJwoXefHVT3PnwhdfwEknhReTy5xLL7WZgmfMSG3/0pXsXO54onCRF9+gXVQE559vddwu/9WpA9dfD3//e/n7Ll8OEyfauiMud7yNwkXesmXQtq1NFLh1K7RsaV0j998/7Mhcpnz/va1X8eGHsN9+Ze/3r39Z6TIb61pUFd5G4QpS06awyy62jsGYMdCqlSeJQlO/PlxzDdx1V/L9vNopHOVM4OxcNJRWP40Z443Yheqqq2DvvW3xqb322v79khKYNQtOPDH3sVV1XqJweaFbN5ua+v33fbbQQtWgAfz2t7a2diIvvGCrGNaundu4nCcKlye6doU334TevWGnncKOxmXLH/5g618vXrz9ez7ILjyeKFxeOOAAa6fwaqfC1qgRXHIJ3HffL7fPn2/TfRx7bChhVXne68nljZUrbZS2K2zLlllnhdmzrSMDWCP34sXwyCPhxlYIvNeTK2ieJKqGpk1tnMygQdu2+Up24fIShXMuckpKbOqWefNgxQo4/nhYtMimJ3eVk06JwhOFcy6SrrgCdtsNatSAtWvh/vvDjqgweKJwzhWMBQvgsMOsl9tzz0HnzmFHVBi8jcI5VzD23BN69bJpWw4/POxoqjYvUTjnImvZMmunOOqosCMpHF715JxzLimvenLOOZdxniicc84l5YnCOedcUp4onHPOJeWJwjnnXFKeKJxzziXlicI551xSKSUKEekpInNEZJ6I3Jjg/SNFZLKIbBKRvjHbW4rIJBGZIiIzReSKmPfeEpGpwfZHRESC7Q1F5B0RmSsiY0SkQSY+qHPOufSUmyhEpBrwMNADOADoJyL7xe22EOgPPBu3fQnQRVU7AIcDN4lIMMM8Z6rqIaraDmgMnBlsvwl4T1V/BYwFbq74x4qO4uLisENIiceZOfkQI3icmZYvcaYjlRJFJ+BzVV2oqpuA4UCf2B1UdZGqfgpo3PbNwTEAdQCJee97ABGpCdSKObYPMDR4PhQ4tUKfKGLy5X8ejzNz8iFG8DgzLV/iTEcqiaI5ELuCbUmwLSUi0kJEpmOljntUdVnMe28Dy4DvgJHB5saquhwg2He3VK/lnHMu81JJFInmBEl5kiVVLVHV9kAb4EIR2S3mvZ7A7kBtwFfDdc65KFLVpA+gM/B2zOubgBvL2LcI6JvkXEMSvQ9cAPwzeP4Z0CR43hT4rIxzqT/84Q9/+KPij/K+9+MfNSjfRKCNiLQClgLnAP2S7P9zCUREmgOrVPVHEWkIdAMGiUg9YEdVXSYiNYCTgQ+Dw0YDFwL3YA3krya6SEVnP3TOOZeelKYZF5GewINYVdWTqnq3iNwOTFTV10WkI/AysDPwI7BMVduJyPHAIGArlkAeUtUnRaQx8DrWiF0d6910rapuFZFd
gBeAPYBFWO+oNZn92M4551KVt+tROOecy428GJktIk+KyHIRmRGzLXID88qIc4CIlASDDqcEpbMwY2whImNFZHYw2PHqYHuk7meCOH8fbI/a/awtIhNiBo8OCLa3FpGPg/v5fFDFGsU4i0RkfrB9iogcFGacQUzVglhGB68jdS9LBXFOjYnzqQjey69EZHoQ0yfBtgr/redFosAayXvEbYviwLxEcQIMVtUOwePtXAcVZzNwnaruD3QBrgwGUEbtfsbHeVXMQM/I3E9V3Qgco6qHAAcDJ4nI4Vgb26Dgfq4BLgkxzGRxAvwxGPzaQVVnlH2WnLkGmB3zOlL3MsY1wKyY1wpcH7F7uRXoHsTUKdhW4b/1vEgUqvoR8G3c5sgNzCsjTkjcxTgUqrpMVacFz7/Hepm1IGL3s4w4S8fvROZ+Aqjq+uBpbaAG9oVxDDAq2D4UOC2E0H4hQZxbg9eRuZ8i0gLr3PJEzOZjidi9LCNOiN53qrB9TBX+W4/ah6qIfBqYd6WITBORJ8Ku0oklIq2xX5cfY12SI3k/Y+KcEGyK1P0srYLABo++C3wJrFHV0i/iEqBZWPGVio9TVScGb90Z3M9BwUwJYbof+BOWbBGRRsC3UbuXxMUZI0r3Eiy+MSIyUUQuDbZV+G89nxNFvngE2FtVD8b+QAeHHA8AIlIfGw1/TfCLPZK9GhLEGbn7qapbgyqdFtiUN20T7ZbbqBIEEBeniOwP3KSqbYHDgEbAdpN+5oqI/BpYHpQkS0s5wvYlnlDvZRlxQoTuZYyuqtoRK/1cKSJHksb9y+dEsVxEmgCITTT4TcjxJKSqK3Rb17LHsf+JQhU0Bo4Ehqlq6TiVyN3PRHFG8X6WUtXvgA+wQao7i02oCfbFvCS0wOLExNkz5pflJqyNrVOyY7OsG9BbROYDz2NVTg8ADSJ2L7eLU0Sejti9JIhlWfDfFcArWEwV/lvPp0QR/8uidGAeJBmYF4JfxCnbZssF6At8mvOItjcEmK2qD8Zsi+L93C7OqN1PEdm1tPpLROoAx2MNsePYNiNy6PezjDjnlN5PERGsrjq0+6mqt6hqS1XdCxvYO1ZVzyNi97KMOC+I0r0M4qgblMgRG+R8IjCTdP7WKzqUO4wH8Bz2K2IjNgjvIqAh8B4wF6sX3jmicT4NzACmYRm9ScgxdgO2BPFMBaYAPYFdonQ/k8QZtfvZLohtWhDXrcH2PbE2lXnACKBmRON8H5gebHsaqBtmnDHxHg2MjuK9TBJnpO5lcN9K/35mYlVjpPO37gPunHPOJZVPVU/OOedC4InCOedcUp4onHPOJeWJwjnnXFKeKJxzziXlicI551xSniicc84l5YnCOedcUv8fxhO6X1RG03QAAAAASUVORK5CYII=",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x109343c18>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.plot(params, test_scores)\n",
    "plt.title(\"n_estimator vs CV Error\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With AdaBoost + Ridge, 25 base estimators likewise get the CV error down to roughly 0.132."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Likewise, you can leave out `base_estimator` here and fall back on AdaBoost's built-in decision tree (DT)."
   ]
  },
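  {
   "cell_type": "code",
   "execution_count": 108,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from sklearn.ensemble import AdaBoostRegressor\n",
    "\n",
    "# Sweep the number of boosting rounds, using AdaBoost's default decision-tree base estimator\n",
    "params = [10, 15, 20, 25, 30, 35, 40, 45, 50]\n",
    "test_scores = []\n",
    "for param in params:\n",
    "    clf = AdaBoostRegressor(n_estimators=param)\n",
    "    # cross_val_score returns the negated MSE, so flip the sign before taking the root\n",
    "    test_score = np.sqrt(-cross_val_score(clf, X_train, y_train, cv=10, scoring='neg_mean_squared_error'))\n",
    "    test_scores.append(np.mean(test_score))"
   ]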
  },
  {
   "cell_type": "code",
   "execution_count": 109,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYQAAAEKCAYAAAASByJ7AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xu8VHW9//HXGxATxfstQVFTFMkriJgh29Kk8opacn4V55THTuaxMj2a+UhM81Jmph6zU+ZdwURL09I0t/d7oggomHIHzRItNdjA5/fHd20YN7P3zL7MXjOz38/HYz/2zFpr1nxmwV6ftT7fyygiMDMz65V3AGZmVh2cEMzMDHBCMDOzjBOCmZkBTghmZpZxQjAzM8AJwczMMk4IVvUk3S3pi3nHYVbvnBCsqkg6S9J1hcsi4jMRcX0F3muQpJWSqvLvQNK/SXpa0j8kLZB0l6T9JB0r6bUi2/eW9LqkzxRZN17ScknvZD//yH5v2T2fxmpBVf4hmHUTAZH9bv+Lpd5dG84H9n0ycDFwLrA5sA1wBXAYcDuwgaT9W7zs08BK4A+t7PaxiFg/++mf/V5c5L3X+Fwd+azVmmitdf4Hs5IkvSbp25Kel/SWpJsl9S3jdYdIei57zSOSdi1Yd5qk+dlV6gxJB0g6GDgD+Hx2Bftctu0Dkr6cPR6f7evibL+vSNo3Wz5X0mJJXyp4n89I+rOktyXNkXRWQYgPZr+XZHHso+RMSbOzfV0jaf1sX813FF+WNAe4v8hnnl54hZ5dtf9V0h6S1pZ0vaQ3s9iflLRZkX2sD5wNnBARv42I9yNiRUTcFRGnRcRS4NfAl1q89IvAjRGxstS/TZH3fE3S/0h6HvhnFnfLZb0kDcn+Pd6SNFXSoQX7uFrSFdmdzD+AhvbGYTmLCP/4p80f4DXgCWALYENgOnB8idfsBbwODCddgX8x289awGBgLrBFtu02wHbZ47OA61rs6wHgy9nj8cAy0slQwDnAHOCybN8HAe8A/bLt9weGZo8/CiwCDsueDwJWACp4ry8DM7N1/YDJzfFky1YC1wDrAGsX+dxnAjcUPP8sMD17fDzwW2DtLPY9gfWK7OPg7DP2auP4fgxY0hwDsD7wHrBrK9uPBx4q8W/8Z2Crgn1+YBnQB5gFnJY9PiA71jtm218NvAWMzJ73zfv/rn/a9+M7BCvXTyPi9YhYAtwJ7FFi++OAKyPimUiuB5YCI0kn4b7ARyX1iYi5EbFGTbwNr0XEdZHOOpOAgcDZEdEUEX8knUx3AIiIhyJiWvb4RWAiMLrF/gpLRv8GXBwRcyLiPeA7wLEF5Y8Azop01b60SGw3A4dJ+lD2fBxwU/a4CdgEGJwdk+ci4p9F9rEJ8Ga0caUfEY+REu6R2aLPAy9HxNTWXgPsK+nv2c9bkma1WP/TiFjY4nMVLhsJrBsRF0bE8oh4APhd9hmb/TYinshiXNZGLFaFnBCsXK8XPH4PWK/E9oOAbxeegEgn7q0i4i/AN4EJwOuSbmpn42ZhLO8DRMSbLZatB5CVgf4k6Q1JS4CvApu2se+tSHcczeaQroa3KFg2v7UXZ59tOnCopHVINf/mhHA9cA8wMSuXXdBKbf5vwKZl1OCvZ3XZ6AvAtSW2fzwiNs5+NoqIHVusL/a5CpdtBcxrsX4OMKDgecv1VkOcEKxS5gE/aHECWi8iJgFExMSIGEVKHAAXZr+7ej72G4HfAAMiYkPg56y+Iyj2XgsLYiJ73MQHk1CpGCeS7jQOB6ZFxKsA2VX1ORExlFTyOZQ12wEAHgf+BRxR4n2uAz4paSSwD6sTT0cV+1yFyxYCW7dYvw2woMQ+rEY4IVil/AL4L0kjACStmzXwritpcNaI3JdU3nmfVEaCdOLdVlJ7ev60te16wFsR0ZTF8m8F6/5KahP4SMGym4FvSdpW0nrAD4CJBeWbcuKaCHwK+BoFJ2lJDZI+ml35/5OUaFa0fHFEvENqS/lfSYdLWkdSH0ljJF1QsN1c4NEs5j9GxBsl4upQb6oCTwLvZg3NfSQ1AIdk7291oKyE
kP1HfEnSTEmnFVk/StKzkpokjW2xbkXWy+M5Sb8pWH5Dts8XJP2ylVtnqw7tvuqLiGeB/wQul/R3UkPt+Gz12sAFpBPyQmAzUu8iSL1nBPxN0jNlvn/L9YXPTwDOkfQ2qcF3UkGM75NO+I9mZa0RwK9IpZiHgL+QymMntfFeawaTunI+Tqq5TypYtSVwK/A2MI3UWH5DK/v4CXByFvMbpEb4r5PudgpdS7pKL1UuAhipNcchDGvjc31gWUQ0kUpgnwHeBC4HvhgRs4ptb7VHqV2ujQ3S1cxM4JOkP96ngWMj4qWCbbYh9XI4BbgjIm4rWPdORKxfZL9jIuIP2eObgAcj4ued/0hmZtYRfcrYZgQwKyLmAEiaSKqNrkoI2a0rkopll6K3qc3JIPMUqcHRzMxyUk7JaAAf7Dkwnw/2KihlbUlPSXpM0uEtV0rqQ+qj3troSqtSkr5TUHoo/Lkr79jMrP3KuUModoXfnlrhNhGxWNJ2wJ8kvdCiz/kVpHLRo+3Yp1WBiDgfOD/vOMysa5STEOaTGq2aDSS1JZQla2AjIl6T1EganfkagKTvAZtGxPGtvb6VMpSZmZUQEe3qWVZOyehpYAeleVz6AscCd7Sx/aoAJG2YvQZJm5L6Xk/Pnh9HGqI/rthOCuU9nLucn7POOiv3GOolzlqI0XE6zmr/6YiSCSEiVgAnAveSuspNjIgZks6WdEh2ch8uaR5wNHClpObh80OAZ5QmKbsfOD9W9076GWkWxyeybqlndugTmJlZlyinZESkHkE7tVh2VsHjZ1hzBCMR8TiwWyv7XKtdkZqZWUV5pHIXaWhoyDuEstRCnLUQIzjOruY481dyYFreJEW1x2hmVm0kERVoVDYzsx7ACcHMzAAnBDMzyzghmJkZ4IRgZmYZJwQzMwOcEMzMLOOEYGZmgBOCmZllnBDMzAxwQjAzs4wTgpmZAU4IZmaWcUIwMzOgRhLC7Nl5R2BmVv9qIiHcemveEZiZ1b+aSAi33JJ3BGZm9a8mEsLs2S4bmZlVWk0khCOPhF//Ou8ozMzqW00khM99zmUjM7NKq4mEMHo0zJkDr76adyRmZvWrJhJCnz4wdqx7G5mZVVJNJARw2cjMrNJqJiHsvz/Mmwd/+UvekZiZ1aeaSQh9+sBRR7m3kZlZpdRMQgA45hgnBDOzSikrIUgaI+klSTMlnVZk/ShJz0pqkjS2xboVkv4s6TlJvylYvq2kJyS9LOlmSX1KxbH//rBgAbzySjlRm5lZe5RMCJJ6AZcDBwNDgXGSdm6x2RxgPHBjkV28GxF7RcSeEXFEwfILgR9HxE7AEuArpWLp3dtlIzOzSinnDmEEMCsi5kREEzAROLxwg4iYGxEvAlHk9Wplv58AJmePrwWOLCdg9zYyM6uMchLCAGBewfP52bJyrS3pKUmPSTocQNImwFsRsbJgn1uVs7OPfxwWL4ZZs9oRgZmZlVROQih2hV/sTqA120TECOD/AZdI2i7bZ8v9lrVPl43MzCqjZEMu6ep9m4LnA4GF5b5BRCzOfr8mqRHYMyJuk7SBpF7ZXUKb+5wwYcKqxw0NDXzucw2cdBKccUa5UZiZ1bfGxkYaGxs7tQ9FtH1hLqk38DLwSWAR8BQwLiJmFNn2auB3ETE5e74h8F5ELJO0KfAYcFhEvCRpEnBbREyS9DPg+Yi4ssg+o2WMK1bA1ltDYyMMHtz+D21mVu8kERGtteEWVbJkFBErgBOBe4FpwMSImCHpbEmHZG88XNI84GjgSklTs5cPAZ6R9BxwP3BeRLyUrTsdOFnSTGBj4Kpyg3bZyMys65W8Q8hbsTsEgIcfhhNPhOefzyEoM7MqV5E7hGq1337w5pvw0kultzUzs9JqNiH06gVHH+2ykZlZV6nZhACe28jMrCvVdEL42Mfgb3+DGWv0dzIzs/aq6YTQq5fvEszMukpNJwTw3EZmZl2l5hPCyJGwZAlMn553JGZmta3mE4LLRmZmXaPmEwK4
bGRm1hXqIiHssw/84x8wbVrekZiZ1a66SAgepGZm1nl1kRBgddmoyqdmMjOrWnWTEPbZB95912UjM7OOqpuEIKXeRm5cNjPrmLpJCLC6+6nLRmZm7VdXCWHECHjvPXjxxbwjMTOrPXWVECSPSTAz66i6Sgjg3kZmZh1Vdwlh+HBYtgxeeCHvSMzMakvdJYTm3kYepGZm1j51lxDAZSMzs46oy4QwbBgsXw7PP593JGZmtaMuE4IHqZmZtV9dJgRIZSMPUjMzK1/dJoS99oKVK2HKlLwjMTOrDXWbEDxIzcysfeo2IcDqdgSXjczMSqvrhLDnnulO4bnn8o7EzKz61XVCcNnIzKx8ZSUESWMkvSRppqTTiqwfJelZSU2SxhZZ31/SfEmXFiwbJ+kFSVMk3S1p4859lOI8SM3MrDwlE4KkXsDlwMHAUGCcpJ1bbDYHGA/c2MpuzgEaC/bZG7gEGB0RewBTgRPbG3w5dt8deveGZ5+txN7NzOpHOXcII4BZETEnIpqAicDhhRtExNyIeBFY4zpc0jBgc+DewsXZ7/6SBKwPLOxA/CU1l408t5GZWdvKSQgDgHkFz+dny0rKTvYXAaeyOgkQEcuBE0h3BvOBIcBV5YXcfi4bmZmV1qeMbVRkWbmn1hOAuyJiQcoNaV+S+gBfA3aPiNmSLgPOAH5QbCcTJkxY9bihoYGGhoYy3z7ZbTfo2xeeeQb23rtdLzUzqwmNjY00NjZ2ah+KEpfNkkYCEyJiTPb8dCAi4sIi214N3BkRt2XPbwA+DqwE+gNrAVcAtwEXRMSB2XajgNMi4pAi+4xSMZbjzDNh6VL40Y86vSszs6oniYgodkHfqnJKRk8DO0gaJKkvcCxwR1txND+IiC9ExLYRsT1wCnBdRJwBLACGSNok2/QgYEZ7Am8vz21kZta2kgkhIlaQegDdC0wDJkbEDElnSzoEQNJwSfOAo4ErJU0tsc9FwNnAw5KmALsD53Xuo7Rt113hQx+Cp5+u5LuYmdWukiWjvHVVyQjge9+D996Diy7qkt2ZmVWtjpSMelRCmDoVDjkEZs9O3VHNzOpVpdoQ6sZHPwr9+sFTT+UdiZlZ9elRCcFzG5mZta5HlYwAXnwRPvOZVDbq1aPSoZn1JC4ZlWHoUFhvPXjyybwjMTOrLj0uIXhuIzOz4npcyQhg2jQYMwbmzHHZyMzqk0tGZRo6FNZfH554Iu9IzMyqR49MCODeRmZmLfXIkhHA9OnwqU/B3LkuG5lZ/XHJqB122QU22ggefzzvSMzMqkOPTQgAxxzjspGZWbMeWzICmDEDDjwQ5s1z2cjM6otLRu00ZAhssgk89ljekZiZ5a9HJwRwbyMzs2Y9umQE8PLLcMABqWzUu3fF3sbMrFu5ZNQBO+0Em20Gjz6adyRmZvnq8QkBPLeRmRm4ZATAzJkwejTMn++ykZnVB5eMOmjwYNhiC3jkkbwjMTPLjxNCxr2NzKync8koM2sWjBoFCxa4bGRmtc8lo07YcUfYait4+OG8IzEzy4cTQgHPbWRmPZlLRgVeeQX22w8WLnTZyMxqm0tGnbTDDjBwIDz0UN6RmJl1PyeEFtzbyMx6KpeMWnj1VRg5MpWN+vTptrc1M+tSFSsZSRoj6SVJMyWdVmT9KEnPSmqSNLbI+v6S5ku6tGDZWpJ+LullSdMlHdmewCtl++1hm23gwQfzjsTMrHuVTAiSegGXAwcDQ4FxknZusdkcYDxwYyu7OQdobLHsu8DrEbFTROwCVM0p2HMbmVlPVM4dwghgVkTMiYgmYCJweOEGETE3Il4E1qjtSBoGbA7c22LVl4HzC/bx93bGXjHHHAO33QbLl+cdiZlZ9yknIQwA5hU8n58tK0mSgIuAUwEVLN8ge3huVmqaJGmz8kKuvO22g0GDoLEx70jMzLpPOc2mxRolym3lPQG4KyIWpNywal99gIHAwxHxbUnfAn4MfKnYTiZMmLDqcUNDAw0NDWW+fcc19zY6
8MCKv5WZWac1NjbS2Mmr2JK9jCSNBCZExJjs+elARMSFRba9GrgzIm7Lnt8AfBxYCfQH1gKuiIgzJP0jIvpn2w0Efh8RuxbZZ7f2Mmo2ezbsvTcsWuTeRmZWeyrVy+hpYAdJgyT1BY4F7mgrjuYHEfGFiNg2IrYHTgGui4gzstV3Sjoge3wgML09gVfattumHkcPPJB3JGZm3aNkQoiIFcCJpEbhacDEiJgh6WxJhwBIGi5pHnA0cKWkqWW89+nABElTgP8HfLujH6JSPLeRmfUkHpjWhjlzYNiwVDZaa61cQjAz6xDPZdTFBg1K8xu5bGRmPYETQgme28jMegqXjEqYOxf23BMWL3bZyMxqh0tGFbDNNjB4MNx/f96RmJlVlhNCGTy3kZn1BC4ZlWHePNhjj9TbqG/fXEMxMyuLS0YVsvXWsNNOLhuZWX1zQiiTexuZWb1zyahM8+fD7ru7bGRmtcElowoaOBCGDIH77ss7EjOzynBCaAfPbWRm9cwlo3ZYsAB23TWVjdZeO+9ozMxa55JRhQ0YAEOHumxkZvXJCaGd3NvIzOqVS0bttHBhuktYvNhlIzOrXi4ZdYOttkrtCPfem3ckZmZdywmhAzy3kZnVI5eMOmDRIthll/T7Qx/KOxozszW5ZNRNPvxh2G03l43MrL44IXSQexuZWb1xyaiDFi9OU1m4bGRm1cglo2605ZbpOxLuuSfvSMzMuoYTQid4biMzqycuGXXC4sWw886pbLTOOnlH0z4RsGIFNDXBsmUf/F1sWXt/d+a1m2wCP/85bL993kfJrHZ1pGTUp1LB9ARbbgl77ZXKRkcc0T3vuWIFvPMOLFmSft5+e/XjUs/feQeWLl194oX03Q59+8Jaa7X/d6lt+vfv2D4eegj23RduuAEOOqh7jquZ+Q6h0668Mp3AbrqpvO2XLWv7pF3qBP/uu+lEu+GGq3822KD154WP118/TbfRfOLt3buyx6YzGhth3Dg45RQ4+WRQu65zzKwjdwhOCJ30xhsweDCcf/7qk3dbJ/WmptIn7rae9+8PvXpIy8+cOXDkkak31y9+Af365R2RWe1wQsjJeefB3LnlneT79fPVbnu89x4cfzxMnw633w6DBuUdkVltqFhCkDQGuITUK+mqiLiwxfpR2frdgM9HxG0t1vcHZgC3RcRJLdbdAWwbEbu18t5VnxCssiLgJz+BH/0Ibr4ZGhryjsis+lVkHIKkXsDlwMHAUGCcpJ1bbDYHGA/c2MpuzgEai+z7SOCddsRrPZCU2hGuvx4+/3m47LKUJMysa5VTjR4BzIqIORHRBEwEDi/cICLmRsSLwBp/ppKGAZsD97ZYvi7wLeDcDsZuPcyBB8Ljj6f2hC9/Gf71r7wjMqsv5SSEAcC8gufzs2UlSRJwEXAq0PLW5Zxs3fvl7MsM0tiExx9Pva323x/mz887IrP6Uc44hGI1qHJv2E8A7oqIBSpoSZW0O7BDRJwsadtW3mOVCRMmrHrc0NBAg4vIPdq668KkSXDhhbDPPmm0+H775R2VWb4aGxtpbGzs1D5KNipLGglMiIgx2fPTgWjZsJytuxq4s7lRWdINwMeBlUB/YC3gCmAucCawLFu2OfBoRHyiyD7dqGyt+v3vYfx4OOcc+OpX847GrHpUpJeRpN7Ay8AngUXAU8C4iJhRZNurgd9FxOQi68YDw4r0MhpESiLuZWQdMmsWHH44jBoFl17q77o2gwr1MoqIFcCJpEbhacDEiJgh6WxJh2RvPFzSPOBo4EpJU9sfvlnH7LgjPPlkGiT4iU+kuaXMrP08MM3qxsqVcO65qRfSrbem9gWznsojlc2AO+6Ar3wFfvhD+I//yDsas3w4IZhlZsxIM9B+6lNw8cVpMj+znsTfmGaWGTIktSu89loa0PbGG3lHZFb9nBCsbm24YSofjRoFe+8Nzz6bd0Rm1c0lI+sRbr0Vvva1NEneF76QdzRmlec2BLM2TJ2a2hWOOCKNcu7j
7wu0OuaEYFbC3/8Oxx6buqhOmpS+v9msHrlR2ayEjTeGu+9O34W9997w/PN5R2RWPZwQrMfp0yeNUfjBD1IPpFtuyTsis+rgkpH1aFOmpO9tPvbYNMq5d++8IzLrGm5DMOuAN9+Ez30uTYp3002w0UZ5R2TWeW5DMOuATTeFe+6BnXaCESNg2rS8I+p5mppSQ7/lywnBjDS1xSWXwJlnQkMD3H573hH1HDNmwG67wejRsHBh3tH0bE4IZgXGj0+9kL7xDTjrLF+1Vtott6SvQj311DTv1PDh8MADeUfVc7kNwayI11+Ho49O7Qk33ADrr593RPWlqQlOPz3did16a+oGDPDHP8IXvwgnnZTW9/Ila4e5DcGsi2yxBdx/P2y9dfpehZdfzjui+rF4ceruO306PPPM6mQAcNBB8PTTcOed6Vvw3norvzh7IicEs1b07Qv/+7/w7W+nCfJ+97u8I6p9jz6aykIHHJCO58Ybr7nN1lvDgw/CRz4Cw4Z5UsLu5JKRWRkefxyOOQb+67/gjDNcymivCLj88jTW41e/gs9+trzX/frXcMIJaRDhf/4nqF0FkJ7N4xDMKmjhQhg7FgYMgGuugf79846oNrz7Lhx/fOrOe9ttsP327Xv9yy/DUUel0tKVV0K/fpWJs964DcGsgrbaKpUyNtwQ9t0XXnkl74iq36xZMHJk6tb72GPtTwaQxoc8+WS6y9hnH5g5s+vjtMQJwawd1l4bfvnLVMYYORLOOw+WLs07qur029/CfvvB178OV1/duSv7ddeF665L+9pvv9QzybqeS0ZmHfTqq3DyyfDii/DTn5ZfF693K1bA974H11+f2gD22adr9//MM6k954gj0iSF/r7s4tyGYJaDP/whDWTbccc02nmHHfKOKD9vvgnjxqUBfTffDJtvXpn3+fvf4UtfSt1SJ02CgQMr8z61zG0IZjkYMyZ9G9v++6cy0ne/mxpSe5qnnkrdRIcPT3NDVSoZQOquescd6a5s773hvvsq9149iROCWRfo2xf+53/SF+7Mng1DhqQr155wcxsB//d/cMgh6Q7p/PO75+tJe/VKXYBvuCGNbj73XE810lkuGZlVwMMPw4knpivZyy6Dj34074gq4/33U0Pvk0+maSgGD84njgUL4POfT1OMXH+9vxoVXDIyqxqjRqURtkcfDZ/4BHzzm7BkSd5Rda3XXks9ft57LyWEvJIBpLEhDzyQ7syGDUvTX1j7OSGYVUifPunqedq0dNIcMiSN0q2Hssbvf5/aS8aPT43H662Xd0Spt9GPfwwXX5zaFn72s55RsutKZSUESWMkvSRppqTTiqwfJelZSU2SxhZZ31/SfEmXZs/XkfQ7STMkTZV0Xuc/ill12myzVGO/8074xS/SoLZavYJduRK+/3047rg0FuAb36i+6STGjk1zJv3sZ6ltoSc28HdUyYQgqRdwOXAwMBQYJ2nnFpvNAcYDN7aym3OAxhbLfhQRQ4A9gY9LOrgdcZvVnOHD04nqa1+Dww5LJ9W//jXvqMr31ltw6KFpiupnnkllsWq1447wxBPpLm3ECHjppbwjqg3l3CGMAGZFxJyIaAImAocXbhARcyPiRWCNGzRJw4DNgXsLtn8/Ih7MHi8H/gy4J7HVvV694N//PZ2g1l8fdtklNTovX553ZG177rlUmx88GP70J/jwh/OOqLR+/dII6W9+MyWvW27JO6LuMX16mgywI8pJCAOAeQXP52fLSpIk4CLgVKDojaWkDYFDgfvL2adZPdhgg1TrbmyE3/wmTdz24IN5R1XctdembzM77zz4yU9qa2SwlGZJvece+M53Uolr2bK8o+paESlhn3lmaqc6+OCO33mWkxCKncjLbao5AbgrIhYU25ek3sBNwCURMbvMfZrVjaFD06Cq730vjbwdNw7mz887qmTp0jRn03nnpcR17LF5R9Rxe+2VylyzZ6cBhPPmlXxJVVu5MpXETj01fW/EMcekb6G79lqY
OzeNB+mIcoaPzAe2KXg+ECj3q7D3JbUPnAD0B9aS9I+IOCNb/3/AyxFxWVs7mTBhwqrHDQ0NNDQ0lPn2ZtVPSt1TP/1puOAC2GMPOOUU+Na30mR6eZg3L8U0YEBqAK+HrxDdaKM0VuJHP0qjm6+7Lt351IoVK+CRR2Dy5DSN+AYbpGnBb78ddtsNHnywkbvvbuTuuzvxJhHR5g/QG3gFGAT0BaYAQ1rZ9mrgqFbWjQcuLXh+LvDrMt4/zHqSV16JOPTQiB13jLjrru5///vui9hyy4gLLohYubL73787PPBAxIc/HDFhQsSKFXlH07plyyLuuSfi+OMjNt88Yo89Is45J2L69NKvzc6dJc/xhT9ljVSWNAb4KanEdFVEXCDpbODpiPidpOHA7cCGwL+AxRGxa4t9jAeGRcRJkprbJWYAy0glqMsj4ldF3jvKidGs3vz+96nmvfPOqXb/kY9U9v0i0uyhl1ySpoP45Ccr+355W7QojW7u1y993k03zTui5F//Sj25Jk9OXZV33DHdCRx1VPu+T8KznZrVmaVLUzK46KLUXfU736nMN4a9807q/bRgQRpfsPXWXf8e1Wj58jQf0qRJqRdSV0/VXa53300XAJMnp9+7754SwNixHZ/J1VNXmNWZtdeG00+HKVPSN7QNGZK+Y6Arr5GmTUs19S23hIce6jnJANI4hR/+MH2fxaGHpu997q7rz7ffhhtvTCf9rbZKgxcbGtJXhj74IJx0UvdP6+07BLMa0tgI//3faWrpyy5L4xg6Y+LEtL+LLkrTUPRkf/lLuiofMiSNKK/EdBx/+1v6JrnJk9MEiKNHp/c87LA0EWJXcsnIrAdYvjxNy/D976epGc46K/U4aY+mpjRd9x13pJPTHntUJtZa8/77aZbaxx5Lx6WzCRdg8eLUE2jy5NRj66CDUhL47Gcr23vLJSOzHqBPn3RVP21aqv0PGZL6n5c7ad6iRWkG1pkzU998J4PV1lkHrroq9e8fPTpN3NcRzWMBRo1K/z6PPJLGdCxalNpoxo2rzq68vkNvbk16AAAHF0lEQVQwq3FPPZWuanv3TjXwYcNa3/bhh9MAs69+NY1s7eVLwlZNmZLGYhx8cBpVXmpMyCuvpLuAyZPT920fdli6EzjwwHzGk7hkZNZDrVwJ11yTvr7zsMPSXDaF3SgjUsPp+een7T796bwirS1vv516Xy1cmHohDRq0el1EmjeoOQm8/joceWRKAqNH5z/FhxOCWQ+3ZElqU7j55vT7q19N/dqPOy6ViCZPhu22yzvK2hKRvmfhoovSZHlbbLE6Cbz77uoxAh/7WLpLqxZOCGYGwNSpqdviW2+lBuQRI+CKK1KN3Drm4YdXD2RrTgJ771193wfRzAnBzFaJSFexTU2p3aBaT1y1ZMWK1O5SC8fSCcHMzAB3OzUzs05wQjAzM8AJwczMMk4IZmYGOCGYmVnGCcHMzAAnBDMzyzghmJkZ4IRgZmYZJwQzMwOcEMzMLOOEYGZmgBOCmZllnBDMzAxwQjAzs4wTgpmZAU4IZmaWcUIwMzOgzIQgaYyklyTNlHRakfWjJD0rqUnS2CLr+0uaL+nSgmV7SXoh2+clnfsYZmbWWSUTgqRewOXAwcBQYJyknVtsNgcYD9zYym7OARpbLPsZcFxEDAYGSzq4HXFXncbGxrxDKEstxFkLMYLj7GqOM3/l3CGMAGZFxJyIaAImAocXbhARcyPiRSBavljSMGBz4N6CZVsC/SPiqWzRdcARHfsI1aFW/pPUQpy1ECM4zq7mOPNXTkIYAMwreD4/W1aSJAEXAacCarHP+R3Zp5mZVUY5CUFFlq1xJ9CKE4C7ImJBF+7TzMwqQBFtn4cljQQmRMSY7PnpQETEhUW2vRq4MyJuy57fAHwcWAn0B9YCrgAuBR6IiCHZdscCoyPia0X26URhZtYBEVHs4rtVfcrY5mlgB0mDgEXAscC4NrZfFUBEfGHVQmk8MCwizsievyNp
RLb/L5GSxBra+4HMzKxjSpaMImIFcCKpUXgaMDEiZkg6W9IhAJKGS5oHHA1cKWlqGe99AnAVMJPUaP2Hjn4IMzPrvJIlIzMz6xmqaqSypKskvS7phYJlG0m6V9LLku6RtEEVxnhWNvDuz9nPmDxjzGIaKOlPkqZLmirppGx5tR3PlnH+d7a8qo6ppLUlPSnpuSzOs7Ll20p6IjueN0sqpwzb3TFeLenVbPmfJe2WV4yFJPXK4rkje141x7JQFudzBXFeU23HU9JsSc9nMT2VLWv333pVJQTgatIAuEKnA/dFxE7An4DvdHtUH1QsRoCLI2Kv7Kcayl/LgZMjYhdgX+Dr2YDCajueLeM8sWDgY9Uc04hYChwQEXsCewCflrQPcCHw4+x4LgG+UoUxApwSEXtmx/KF1vfSrb4BTC94XjXHsoVvkMrlzQL4dpUdz5VAQxbTiGxZu//WqyohRMQjwFstFh8OXJs9vpacB7C1EiMU70qbm4hYHBFTssf/BGYAA6m+41kszuYxKdV2TN/LHq5N6pARwAHA5Gz5tcCROYS2SpEYV2bPq+pYShoIfAb4ZcHiT1BFxxJajROq7NxJ+vdtGVO7/9ar7UMVs3lEvA7p5AFslnM8rfm6pCmSfpl3GaYlSduSrhifALao1uNZEOeT2aKqOqbNpQNgMfBH4C/AkohoPunOB7bKKz5YM8aIeDpbdW52LH8saa0cQ2z2E9KA1QCQtAnwVjUdy8wH4ixQbcczgHskPS3puGxZu//WayEh1IIrgI9ExB6kP8SLc45nFUnrAbcC38iuwKuyF0GROKvumEbEyqwcM5A0pcuQYpt1b1Qt3rxFjJJ2AU7PxvzsDWwCrDFBZXeS9Fng9ezOsPnORax5F5PrsWwlTqiy45n5WEQMJ93NfF3SKDpw/GohIbwuaQtYNQfSGznHs4aI+Gus7q71C9J/lNxljXK3AtdHxG+zxVV3PIvFWa3HFCAi3gEeBEYCGypNAAnpJLwwt8AKFMQ4puAqsYnUBjairdd2g/2AwyS9CtxMKhVdAmxQZcdyjTglXVeFx7P5DoCI+CvwG1JM7f5br8aE0PJK4Q7g37PH44HftnxBDj4QY3awm40FXuz2iIr7FTA9In5asKwaj+cacVbbMZW0aXPZStI6wIGkBtEHgGOyzXI9nq3E+FLzsZQkUh0512MZEWdExDYRsT1poOufskGsVXMsodU4v1Rtx1NSv+wOG0nrAp8CptKRv/WIqJof4CbSVcFSYC7wH8BGwH3Ay6S67YZVGON1wAvAFFJ23qIKjuV+wIospueAPwNjgI2r7Hi2FmdVHVNg1yy2KVlc382Wb0dq85gJTALWqsIY7weez5ZdB/TL+/9nQcyjgTuq7ViWiLOqjmd23Jr/fqaSSlp05G/dA9PMzAyozpKRmZnlwAnBzMwAJwQzM8s4IZiZGeCEYGZmGScEMzMDnBDMzCzjhGBmZgD8fxlXffxDV8AqAAAAAElFTkSuQmCC",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x1089ae940>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.plot(params, test_scores)\n",
    "plt.title(\"n_estimator vs CV Error\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Looks like we should probably tune our DT model first before running this experiment. :P"
   ]
  },
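  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of what tuning the tree first could look like, assuming `X_train` and `y_train` from earlier in this notebook. The candidate depths and the names `dt_depths`, `dt_scores` and `best_tree_depth` are made up for illustration; they are not part of the original experiment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.model_selection import cross_val_score\n",
    "from sklearn.tree import DecisionTreeRegressor\n",
    "\n",
    "dt_depths = [2, 4, 6, 8, 10]  # hypothetical candidate depths\n",
    "dt_scores = []\n",
    "for depth in dt_depths:\n",
    "    tree = DecisionTreeRegressor(max_depth=depth, random_state=0)\n",
    "    # Same 10-fold CV RMSE as in the other experiments\n",
    "    rmse = np.sqrt(-cross_val_score(tree, X_train, y_train, cv=10, scoring='neg_mean_squared_error'))\n",
    "    dt_scores.append(np.mean(rmse))\n",
    "\n",
    "# Depth with the lowest mean CV RMSE\n",
    "best_tree_depth = dt_depths[int(np.argmin(dt_scores))]\n",
    "print('best tree depth:', best_tree_depth, 'CV RMSE:', min(dt_scores))"
   ]
  },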
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### XGBoost\n",
    "\n",
    "Finally, let's look at the mighty XGBoost, a.k.a. the Kaggle killer tool.\n",
    "\n",
    "It is still a model in the boosting family, but with a lot of improvements."
   ]
  },
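  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the boosting idea concrete, here is a rough sketch of plain gradient boosting for squared error: each small tree is fit to the residuals left by the trees before it. This only shows the basic idea and is not XGBoost's actual algorithm (XGBoost adds, among other things, a regularized objective and second-order gradient information). The helper names `fit_boosted_trees` and `predict_boosted_trees` and the toy data are assumptions made up for this illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.tree import DecisionTreeRegressor\n",
    "\n",
    "def fit_boosted_trees(X, y, n_rounds=50, learning_rate=0.1, max_depth=2):\n",
    "    # Start from the mean, then repeatedly fit a small tree to the current residuals\n",
    "    base = float(np.mean(y))\n",
    "    pred = np.full(len(y), base)\n",
    "    trees = []\n",
    "    for _ in range(n_rounds):\n",
    "        tree = DecisionTreeRegressor(max_depth=max_depth)\n",
    "        # The residual is the negative gradient of the squared-error loss (up to a constant factor)\n",
    "        tree.fit(X, y - pred)\n",
    "        pred = pred + learning_rate * tree.predict(X)\n",
    "        trees.append(tree)\n",
    "    return base, trees, learning_rate\n",
    "\n",
    "def predict_boosted_trees(model, X):\n",
    "    # Sum the shrunken contributions of all trees on top of the base prediction\n",
    "    base, trees, learning_rate = model\n",
    "    pred = np.full(X.shape[0], base)\n",
    "    for tree in trees:\n",
    "        pred = pred + learning_rate * tree.predict(X)\n",
    "    return pred\n",
    "\n",
    "# Toy data, made up for this illustration (not the housing data)\n",
    "rng = np.random.RandomState(0)\n",
    "X_demo = rng.uniform(-3, 3, size=(200, 1))\n",
    "y_demo = np.sin(X_demo[:, 0]) + rng.normal(scale=0.1, size=200)\n",
    "\n",
    "demo_model = fit_boosted_trees(X_demo, y_demo)\n",
    "demo_rmse = np.sqrt(np.mean((predict_boosted_trees(demo_model, X_demo) - y_demo) ** 2))\n",
    "print('training RMSE on toy data:', demo_rmse)"
   ]
  },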
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from xgboost import XGBRegressor"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Evaluate the model with scikit-learn's built-in cross-validation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Sweep the tree depth and record the mean 10-fold CV RMSE for each value\n",
    "params = [1,2,3,4,5,6]\n",
    "test_scores = []\n",
    "for param in params:\n",
    "    clf = XGBRegressor(max_depth=param)\n",
    "    # cross_val_score returns the negated MSE, so flip the sign before taking the root\n",
    "    test_score = np.sqrt(-cross_val_score(clf, X_train, y_train, cv=10, scoring='neg_mean_squared_error'))\n",
    "    test_scores.append(np.mean(test_score))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Keep all the CV scores and see which `max_depth` value works best (in other words, hyperparameter tuning)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 102,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYEAAAEKCAYAAAD0Luk/AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XmUHXWZ//H3JwsxCYFgWE0IywBhESULCYpIA44ERAL8ZkYizICKBwcFRwVZFEgGB+GMCyA6IxpAFMURg6wiI6RBZEwCCXtCgkCWIWEZCWFJSEie3x/funZ1c7v7dtKdurfr8zqnTvetqlv13JtOPfV8v1X1VURgZmbl1KfoAMzMrDhOAmZmJeYkYGZWYk4CZmYl5iRgZlZiTgJmZiXmJGANS9LBkpb00LZ3krRekv+PWK/mP3BrdN1yo4ukZyUd2hPb7mIcQyRdJmmRpJWSFkj6jqRhku6UNKXKeyZJWlYtYUlqlrQq21ZlunmTfBhrCE4CZnVCUn/gHmAv4KMRsQXwQeD/gP2Ba4F/qvLWE4GfRsT6KssCOC0itshNk9rZf99a5nXyGbq0vhXPScBqlp0tnynpEUmvSfqRpG0l3ZGdYd4lacvc+v+VnaG+kp2R7p3N7y9prqQvZK/7SLpf0tc72f+7JF0r6S+SHicdGPPLd5B0o6QXJf1Z0um5ZRdK+pWkG7JYH5S0b7bsOmAkcGu27MzK24ATs7PyFyWd105cE7LPqdy8YyU9kv0+XtJsSa9m632rnY94EjACOCYingKIiJcj4t8i4k7gN8BWkj6U289Q4Cjguo6+unbiPljSEklflbQMuLravGzdz0paKOllSb+RtENuO+slnSZpAbCggzisHkWEJ081TcCzwAPA1sAOwAvAg8D7gP7A3cD5ufVPBgZly74DzM0t24d0hrsn8LVsu+pk/5cA9wJbAsOBx4DF2TJlsXwN6AvsDDwN/G22/ELgLeDYbPlXgGeAvrnPdkhuXzsB64EfAptln3E1MKqd2BYCh+Ve/xdwVvb7A8AJ2e+DgPHtbOMXwDWdfAdXAVflXp8KzOlg/RnAp9tZdjCwFrg4+zca0M68Q4GXgPdn864A7s1tZz3wu+zfZUDRf6eeujYVHoCnxpmyA+Xk3Osbge/nXn8BmN7Oe4dmB4shuXlfAuZlyWDXGvb/58pBPXv92VwSmAA812b9c4Bp2e8XAg/klgl4Hjgw99kOzS3fCVgH7JCbNxP4h3Ziuyi3ryHA68CI7HVztv9hnXy+u4CLO1nnQGBF5WAL3A98sYP1ZwBvAH8BXsl+Ts2WHZwltv659avN+zFwSe71YGANMDJ7vR44uOi/T08bNrk5yLrqhdzvq6q83hz+2sRziaSnJa0gHWSDVEVUXEc6Y78jIp6pYd/vAZbmXi/K/T4SGJ41Ff1F0ivAucC2uXX+eiVRpKPX0mybHcl/vjcrn6+KnwPHZu36xwEPRUQl1s8Ao4D5kmZK+lg72/g/UoXVroj4I/AiMEnSLsC4bN8dOT0i3h0RW2U/L8wteyki1rZZv+2895D7riPijSzW4bl18v8u1kD6FR2A9VonAB8nnV0vzvoKXqF1+/QPgFuBwyV9MCIe6GSbzwM7kqoHSGfrFUuAZyJiVAfv37HyS9Z+PwL432zWRl0JFBHzJC0CjgQmkzswR8SfgU9m+/1/wI2S3h0Rq9ps5vfARZIGVlmW91NS/8GewF0R8dLGhF7DvOfJfdeSBgPDaH3g9+OIG5QrAespm5Pa4F/JDhrfJHegkPSPwBhSv8EXgeskDepkm78CzpU0VNIIUvNTxSxgZdah+S5JfSXtI2lcbp2xko7JrmD5EqnZY2a2bDmwa5v9Ve1Q7cDPgTOAg7JY00akEyRVKqBXSd/Duirv/ykpmf1a0iglwySdK2libr3rgI8ApwA/6WKMG+LnwKckvU/SAFJ/wZ8iokfu0bBNy0nAuqLt2V5HZ3/XAYtJZ9qPkzpHAZC0I6mj+B8j4s2I+AUwG/huJ/ufmm3zWeBOclfERLo88uPAftnyF4EfAVvk3n8z
8AlSRXICcGxEVA7GlwDnZ01JX96AzwtwA6lN/e6I+Etu/kTgCUkrs8/4iYhY0/bN2byPAPOB/yYljD+Rzrpn5tZbRPo+BwG3dBITwJW5ewRekzS7hvfk47oHOB+YTvr33AU4Pr9KV7Zn9UWpabSTldJZyGWkpDEtIi5ts/ygbPn7SH/g03PL1gGPkM6qFkXEMdn8a0j/YSpnRidHxKPd8aHM2pJ0IfA3EVHtOnuz0uq0T0DpLsQrgcNIbYOzJd0cEfNzqy0itVGeWWUTb0TEmHY2/5WIuKmLMZuZWTeppTloPLAwIhZlVwzcALS64zAiFkfE41QvCztqV3VzlLWS3Xj2Wpvmi5WSzik6NrPeqJarg4aTu7SOdEXA+C7sY4CkWcDbwKURkX9uyTcknU+6yeicKpeqWclExJE9tN2pPbFds0ZXy5l4tTP5rnQEjYyI8aSOuMuya5shHfT3It36Pww4uwvbNDOzblBLJbCUdCNOxQhS30BNImJ59vNZSc3AaODZiHghm7826yT+SrX3S/KVB2ZmGyAiOr3MuZZKYDawm9Lz1TcjXRrW0WVp+YdoDc3eQ3ad9AeBJ7PX22c/BRxDuoywqqJvq66X6cILLyw8hnqZ/F34u/B30fFUq04rgYhYp/S0x7touUR0nqSpwOyIuC27Iecm0vNhjpI0JSL2JT0S94fZZaJ9gG9Gy1VF12eJQcDDwOdqjtrMzLpFTY+NiPQY21Ft5l2Y+/1Bcrfk5+b/D+negWrbPKxLkZqZWbfzJZoNpKmpqegQ6oa/ixb+Llr4u+i6mu4YLpKkqPcYzczqjSSimzqGzcysl3ISMDMrMScBM7MScxIwMysxJwEzsxJzEjAzKzEnATOzEnMSMDMrMScBM7MScxIwMysxJwEzsxJzEjAzKzEnATOzEnMSMDMrMScBM7MScxIwMysxJwEzsxJzEjAzKzEnATOzEnMSMDMrMScBM7MScxIwMysxJwEzsxJzEjAzKzEnATOzEnMSMDMrsZqSgKSJkuZLWiDp7CrLD5L0kKS1ko5rs2ydpDmS5kr6TW7+zpL+JOkpSb+Q1G/jP46ZmXVFp0lAUh/gSuBwYB9gsqQ926y2CDgJuL7KJt6IiDERMToijsnNvxT4dkSMAlYAn9mQD2BmZhuulkpgPLAwIhZFxFrgBmBSfoWIWBwRjwNR5f1qZ7uHAr/Ofv8JcGxtIZuZWXepJQkMB5bkXi/N5tVqgKRZkh6QNAlA0jDglYhYn9vme7qwTTMz6wa1tMNXO5OvdsbfnpERsVzSLsA9kh4FXquy3a5s08zMukEtSWApMDL3egTwfK07iIjl2c9nJTUDoyNiuqQtJfXJqoEOt3nuuVMYMCD93tTURFNTU627NzMrhebmZpqbm7v8PkV0fAIuqS/wFHAYsAyYBUyOiHlV1r0GuC0ifp29Hgq8GRFrJG0NPAAcHRHzJf0SmB4Rv5T0H8AjEfGfVbYZF1wQTJ3a5c9mZlZakoiI9vpkW9brLAlkG5sIXE7qQ5gWEZdImgrMjojbJI0DbgKGAquB5RGxr6QPAD8E1mXv/W5EXJttcxdSJ/NWwFzgxKzjue2+493vDubPh222qeWjm5lZtyaBIkmKz38+2Gwz+M53io7GzKwx9KoksGxZsM8+8PDDsOOORUdkZlb/elUSiAjOOw9efBF+/OOiIzIzq3+9Lgm88grssQfcfz+MGlV0VGZm9a3WJNAwD5Dbaiv48pfh/POLjsTMrPdomEoA4I03YPfd4bbbYMyYggMzM6tjva4SABg8GL72tTSZmdnGa6gkAPDZz8L8+XDffUVHYmbW+BouCWy2GUydCueeC3XekmVmVvcaLgkAnHACvPoq3HFH0ZGYmTW2hkwCffvCN76R+gbWr+98fTMzq64hkwDApEkwYAD88pdFR2Jm1rga6hLRtu6+G049FebNg/79N3FgZmZ1rFdeItrWYYfBzjvD1VcXHYmZWWNq6EoAYNYsOO44WLgQ
Bg7chIGZmdWxUlQCAOPHp+n73y86EjOzxtPwlQDAk09CU1OqBrbcctPEZWZWz0pTCQDsvTcccQR8+9tFR2Jm1lh6RSUA8NxzMHZsulJo2217Pi4zs3rW68YTqMXpp0O/fvDd7/ZwUGZmda6USeCFF1LT0Ny5MHJkDwdmZlbHStUnULHddvC5z6UHzJmZWed6VSUA/HUYyj/8AfbcswcDMzOrY6WsBCANQ/mVr3gYSjOzWvS6SgDgzTdht93g1lvTFUNmZmVT2koAYNAg+PrXPQylmVlnemUSADjlFFiwAO69t+hIzMzqV69NAh6G0sysc702CQB88pOwciXcfnvRkZiZ1adenQT69oV/+zcPQ2lm1p6akoCkiZLmS1og6ewqyw+S9JCktZKOq7J8iKSlkq7IzZuRbXOupDmStt64j1Ld0UencQZuuKEntm5m1tg6TQKS+gBXAocD+wCTJbW9DWsRcBJwfTubuQhorjJ/ckSMjogxEfFyzVF3gQQXXwwXXABr1/bEHszMGlctlcB4YGFELIqItcANwKT8ChGxOCIeB97RBStpLLAtcNcG7n+jHXoo7LILTJu2KfZmZtY4ajkIDweW5F4vzeZ1SpKAbwFnAdVuWrg6awr6ei3b2xgXXwwXXQSrVvX0nszMGke/GtapdvCu9aLL04DbI+J/Uz5ota1PRsQySYOB6ZJOjIifVdvIlClT/vp7U1MTTU1NNe6+xf77wwEHwJVXwllndfntZmZ1rbm5mebm5i6/r9PHRkg6AJgSEROz1+cAERGXVln3GuDWiJievf4Z8CFgPTAE6A/8ICLOa/O+k4CxEXFGlW12+bER7fEwlGZWFt352IjZwG6SdpK0GXA8cEtH+678EhEnRsTOEbErcCZwXUScJ6mvpGFZoP2Bo4DHa4hlo+y9Nxx5JHzrWz29JzOzxtBpEoiIdcAXSB27TwA3RMQ8SVMlHQUgaZykJcDfAf8p6bFONjsA+J2kh4E5pH6GH23E56jZlCnwgx+kAWjMzMquVz5FtDNnnAF9+sBll3XrZs3M6kYph5esVWUYyjlzYKedunXTZmZ1odSPku7MdtvBP/+zh6E0MytlJQCwYgXsvjvcdx/stVe3b97MrFCuBDoxdCiceaaHoTSzcittJQBpGMrdd4ebb4Zx43pkF2ZmhXAlUAMPQ2lmZVfqJADwmc/A00/DBtxtbWbW8EqfBDwMpZmVWemTAMDkyfD663DbbUVHYma2aTkJ4GEozay8nAQyH/84DB4Mv/hF0ZGYmW06pb5EtK0ZM+CUU2DevNRXYGbWqHyJ6AY45BD4m7/xMJRmVh6uBNp48EGYNCkNPDNo0CbbrZlZt3IlsIHGjYMPfCANQ2lm1tu5Eqhi3jz48IdTNTB06CbdtZlZt3AlsBH22guOOsrDUJpZ7+dKoB2LFsGYMWlw+u222+S7NzPbKB5ZrBt88Yvp5+WXF7J7M7MN5iTQDV58MTUNeRhKM2s07hPoBttuC6edBlOmFB2JmVnPcCXQiRUrYI890qOm9967sDDMzLrElUA3qQxDecEFRUdiZtb9XAnUwMNQmlmjcSXQjQYNSgPSn3de0ZGYmXUvJ4EafeYz8Mwz6UmjZma9hZNAjfr39zCUZtb7OAl0weTJqX/g1luLjsTMrHvUlAQkTZQ0X9ICSWdXWX6QpIckrZV0XJXlQyQtlXRFbt4YSY9m27xs4z7GptGnT8swlOvWFR2NmdnG6zQJSOoDXAkcDuwDTJa0Z5vVFgEnAde3s5mLgOY28/4DOCUi9gD2kHR4F+IuzFFHwZAhHobSzHqHWiqB8cDCiFgUEWuBG4BJ+RUiYnFEPA68o7Vc0lhgW+Cu3LztgSERMSubdR1wzIZ9hE1LgosvTvcNrFlTdDRmZhunliQwHFiSe700m9cpSQK+BZwF5K9XHZ5tp8vbrAdNTem+gR//uOhIzMw2Tr8a1ql2s0Gt18ecBtweEf+b8sGGbXNK7uE9TU1NNDU1
1bj7nnPxxfDxj8PJJ3sYSjMrXnNzM83NzV1+X6d3DEs6AJgSEROz1+cAERGXVln3GuDWiJievf4Z8CFgPTAE6A/8ALgCmBERe2XrHQ8cHBH/XGWbhd8x3J6///t0B/HZ7+gqNzMrVnfeMTwb2E3STpI2A44Hbulo35VfIuLEiNg5InYFzgSui4jzImI5sFLS+KzJ6J+Am2uIpa5cdFEafWzFiqIjMTPbMJ0mgYhYB3yB1LH7BHBDRMyTNFXSUQCSxklaAvwd8J+SHqth36cB04AFpI7nOzf0QxRlzz1Tk9C//3vRkZiZbRg/QG4jLV4Mo0fDE0/A9tsXHY2ZWeKRxTahf/kXWL8errii83XNzDYFJ4FNqDIM5UMPwc47Fx2NmZkfJb1JbbstfP7zHobSzBqPK4Fu8uqr6QYyD0NpZvXAlcAmtuWWcNZZafAZM7NG4UqgG61alaqBm26C/fcvOhozKzNXAgUYONDDUJpZY3ES6Gaf/jQ8+yzcc0/RkZiZdc5JoJv17w//+q8ehtLMGoOTQA84/nhYvRpu6egJS2ZmdcBJoAd4GEozaxROAj3kYx9Ll43+/OdFR2Jm1j5fItqD7r0XPvUpmD8fNtus6GjMrEx8iWgdOPhg2GMP+NGPio7EzKw6VwI9bM4cOOooWLgQBg8uOhozKwtXAnVizBj40Ifge98rOhIzs3dyJbAJPPVUSgQLFsBWWxUdjZmVgSuBOjJqFBx9tIehNLP640pgE/EwlGa2KXlksTr0pS/B22+7f8DMep6TQB166SXYc0948EHYZZeiozGz3sx9AnVom23gC1/wMJRmVj9cCWxilWEoZ8yAffYpOhoz661cCdSpLbeEr37Vw1CaWX1wJVCAyjCU06fD+PFFR2NmvZErgTo2cCBccIGHoTSz4jkJFORTn4JFi+Duu4uOxMzKzEmgIJVhKM87z8NQmllxakoCkiZKmi9pgaSzqyw/SNJDktZKOi43f6SkByXNkfSYpFNzy2Zk25ybLd+6ez5S4/jEJ+Ctt+Dmm4uOxMzKqtOOYUl9gAXAYcDzwGzg+IiYn1tnJLAFcCZwS0RMz+b3y/axVtIg4AngAxGxXNIM4MsRMbeT/fe6juG8229PVws9+ij07Vt0NGbWW3Rnx/B4YGFELIqItcANwKT8ChGxOCIeB6LN/Lez9wAMBNoGVPrmqCOPTE8Wvf76oiMxszKq5SA8HFiSe700m1cTSSMkPQIsAi6NiOW5xVdnTUFfr3V7vY0EF18MF14Ia9YUHY2ZlU2/GtapVk7U3D4TEUuB90vaHrhZ0o0R8RLwyYhYJmkwMF3SiRHxs2rbmJJ7zkJTUxNNTU217r4hfPjD6ZlCV12VHithZtZVzc3NNDc3d/l9tfQJHABMiYiJ2etzgIiIS6usew1wa6VPoMryq4Hb2i6XdBIwNiLOqPKeXt0nUDF3bmoaevppD0NpZhuvO/sEZgO7SdpJ0mbA8cAtHe07F8RwSe/Kft8KOBB4SlJfScOy+f2Bo4DHa4il1xo9OlUEV1xRdCRmViY1PTZC0kTgclLSmBYRl0iaCsyOiNskjQNuAoYCq4HlEbGvpI8A3wbWk5LD9yJiWnal0H2k5qi+wO9JVwq9I5iyVAKQhp888EAPQ2lmG8/jCTSoU05Jj5z+5jeLjsTMGpmTQINasgT228/DUJrZxnESaGBf/nK6XPTKK4uOxMwalZNAA/MwlGa2sfwo6Qa2zTZw+unpBjIzs57kSqBOrVyZBp65+25473uLjsbMGo0rgQa3xRYehtLMep4rgTq2ahXssQfceCNMmFB0NGbWSFwJ9AIehtLMepqTQJ07+eR078Dvf190JGbWGzkJ1DkPQ2lmPcl9Ag1g/frUJzBgAHzsY3DEEfD+96exCMzMqvHNYr3MqlXQ3Ay//W2a3ngDJk5MCeFv/xaGDi06QjOrJ04CvdzTT7ckhPvvT5XBEUekab/9XCWYlZ2TQImsWgX33tuSFF57rXWV
4MdSm5WPk0CJ/fnPLQnhD3+A972vdZXQx5cDmPV6TgIGpCrhvvtaksKrr7ZUCR/9qKsEs97KScCqeuaZloRw332w774tVcLo0a4SzHoLJwHr1OrVrauEFSvg8MNbqoR3v7voCM1sQzkJWJc9+2xLQrj33vT00kqVMGaMqwSzRuIkYBtl9erUqVxJCn/5S+sqYdiwoiM0s444CVi3eu651lXC3nu3VAljx7pKMKs3TgLWY956q3WV8PLLqUqYODH93HrroiM0MycB22QWLYI770wJYcaMND5ypUoYNw769i06QrPycRKwQqxZkx5jUakSXngh9SEccUSqErbZpugIzcrBScDqwuLFLVXCPffAqFEtVcL++7tKMOspTgJWd9asgT/+saVKWLasdZWw7bZFR2jWezgJWN1bsqR1lbD77i1VwvjxrhLMNoaTgDWUNWvggQdaqoTnn09PQK1UCdttV3SEZo2lWwealzRR0nxJCySdXWX5QZIekrRW0nG5+SMlPShpjqTHJJ2aWzZG0qPZNi+r9YNZ77TZZtDUBJdeCo8+Cg8/DIcdBjffnPoRxo2D889PiWLduqKjNes9Oq0EJPUBFgCHAc8Ds4HjI2J+bp2RwBbAmcAtETE9m98v28daSYOAJ4APRMRySTOB0yNilqQ7gMsj4ndV9u9KoOTWrm1dJSxdmhLGoYfCIYfAXnt5EB2ztrqzEhgPLIyIRRGxFrgBmJRfISIWR8TjQLSZ/3b2HoCBgLLgtgeGRMSsbNl1wDE1xGIl1L8/HHwwXHIJPPIIPPYYHHsszJmTxlzeYQeYPBmuuiqNuOZzBrPa1ZIEhgNLcq+XZvNqImmEpEeARcClEbE8e//SDd2mldt73gMnngjTpqWH3v3P/6T+gz/8ISWLkSPhpJPg2mvTjWxm1r5+NaxTrZyo+VwrIpYC78/O/m+WdOPGbtMsb5dd0vTpT6cqYOHCdOfyb38LX/0qDBmSmo0qzUc77FB0xGb1o5YksBQYmXs9gtQ30CVZP8ATwEHAA8COtW5zypQpf/29qamJpqamru7eSkKCPfZI06mnpqTw5JPpEtQbb4TTT093LVcSQlOT72K23qG5uZnm5uYuv6+WjuG+wFOkjuFlwCxgckTMq7LuNcBtEfHr7PVw4P8iYrWkrYA/AcdGxJOVjmFSR/PtwBURcWeVbbpj2LrN+vWpX2HGjJQY7r8/NR9VKoUPf9hDblrv0K33CUiaCFxO6kOYFhGXSJoKzI6I2ySNA24ChgKrgeURsa+kjwDfBtaTmoC+FxHTsm2OBa4F3gXcERFfbGffTgLWY95+O3Uw33NPSgwPPJCqiEqlcNBBqTnJrJGsXg0DB/pmMbMuW7MGZs1qqRRmz07jMFcqhQ9+EAYNKjpKsxYRaezwP/0JZs5MP594At5800nAbKOtWpWuPqokhUceSYPoHHJImg44AAYMKDpKK5MVK9KJSuWAP2sWvOtdMGFC+nucMCH9jQ4e7CRg1u1efz09BK/SfDRvXvpPV2k+Gjcu3ddg1h3efjvdF1M54M+cmW6WHDOm5YA/YQIMr3KBvZ8dZLYJvPoq3HdfS6Xw7LNw4IEtzUf77ecH4Vntli5NB/rKQX/uXNhxx9YH/Pe+F/rVcF2nk4BZAV5+OY3BPGNGmp5/Pl1xVKkU3vtej8dsyRtvwEMPtT7Lf+ut1s06++8PQ4du2PadBMzqwPLl0Nzc0ny0YkW6N6FSKYwa5ecelcH69fDUU60P+AsWpJOCygH/gAPSTY/d9ffgJGBWh5YsaakS7rknPRyv0sl8yCGw665OCr3Byy+3btaZPTvdf5I/4O+3X89eVOAkYFbnIlIfQiUhzJiROpXzj7jYccfOt2PFWrMmPfo8f5b/0ktpYKRKO/6ECZt+5DwnAbMGE5GaDCqVwowZqT04Xylsv33RUZZbBDz3XOsD/qOPwm67tT7L33PP4vt+nATMGtz69emmn0qVcN99KQnkn3s0bFjRUfZu
K1emppz8Qb9v39YH/LFjYfPNi470nZwEzHqZdetSs0Ol+eiPf4SddoKdd04VQ2XaaqvWr/PTllsWf4Zar9atS0k3f8B/7jkYPbr1FTsjRjRGv42TgFkvt3ZtSgrPP5+uOspPr7zyznkrVsBrr6VnIVVLEB0lj8qyzTfvPUlk2bLWB/wHH0w3XVXa8A84ID0ypFFv/nMSMLN3WLcuNXFUSxCdJZAVK+DNN2GLLbqePCrT4MHFnEWvWpUeFFg54M+cme7+zh/wx4/vXU+QdRIws2739tvpLukNTSJvvZWapLqaPCrTwIGdJ5HKwEL5s/wnn4R99mndrLPbbo3RrLOhnATMrO6sXbvhCWTFilTJtJc4Bg9Oz3KaOTNVK/kD/ujRKYGUiZOAmfU6q1e3X4msXJnGgpgwwZfSgpOAmVmp1ZoEekk/v5mZbQgnATOzEnMSMDMrMScBM7MScxIwMysxJwEzsxJzEjAzKzEnATOzEnMSMDMrMScBM7MScxIwMysxJwEzsxKrKQlImihpvqQFks6usvwgSQ9JWivpuNz890t6QNJjkh6W9A+5ZddIekbSXElzJL2vez6SmZnVqtMkIKkPcCVwOLAPMFnSnm1WWwScBFzfZv4bwD9GxL7AEcBlkrbILf9KRIyOiDER8eiGfoiyaG5uLjqEuuHvooW/ixb+LrqulkpgPLAwIhZFxFrgBmBSfoWIWBwRjwPRZv7TEfHn7PdlwIvANl3cv2X8B97C30ULfxct/F10XS0H4eHAktzrpdm8LpE0HuhfSQqZb2TNRN+W1KDDOZuZNa5akkC1QQm6NMqLpB2A64CTc7PPiYi9gP2BYcA7+hrMzKyHRUSHE3AAcGfu9TnA2e2sew1wXJt5Q4CH2s5vs87BwC3tLAtPnjx58tT1qbPje0TQj87NBnaTtBOwDDgemNzB+n+tHLImnt8AP4mI6a1WkraPiOWSBBwDPF5tY7UMj2ZmZhumpjGGJU0ELic1H02LiEskTQVmR8RtksYBNwFDgdXA8ojYV9IJwNXAE6TkEMDJEfF4fPkBAAACr0lEQVSopLuBrbP5DwOfi4g3u/8jmplZe+p+oHkzM+s5dXuJpqRpkl6QVOr7BySNkHSPpCezm+7OKDqmokgaIGlmdoPhY5IuLDqmoknqk91seUvRsRRJ0nOSHsn+NmYVHU+RJG0p6VeS5kl6QtKEDtev10pA0oeA14HrIqK0dxNL2h7YPiIelrQ5qZN9UkTMLzi0QkgaFBFvSuoL/BE4IyJK+59e0peAscAWEXF00fEURdIzwNiIeKXoWIom6Vrg3oi4RlI/YFBErGxv/bqtBCLifqD0/6ARsTwiHs5+fx2Yxwbcp9Fb5PqNBgD9SP1MpSRpBHAk8OOiY6kDoo6PZ5uKpCHAQRFxDUBEvN1RAgB/aQ1F0s7AfsDMYiMpTtb8MRdYDvx3RMwuOqYCfRc4ixInwpwAfidptqTPFh1MgXYFXs6ezTZH0lWSBnb0BieBBpE1Bd0IfDGrCEopItZHxGhgBDBB0t5Fx1QESR8DXsiqRFH9ps4y+WBEjCNVRp/PmpPLqB8wBvh+RIwB3iTd29UuJ4EGkLXr3Qj8NCJuLjqeepCVuM3AxIJDKcqBwNFZW/gvgEMkXVdwTIWJiOXZz5dIl6uPLzaiwiwFlkTEg9nrG0lJoV31ngR8hpNcDTwZEZcXHUiRJG0tacvs94HAR4BSdpBHxHkRMTIidiXdwHlPRPxT0XEVQdKgrFJG0mDgo7Rz82lvFxEvAEsk7ZHNOgx4sqP31HLHcCEk/RxoAoZJWgxcWOnsKBNJBwInAI9lbeEBnBcRdxYbWSF2AH6SPd68D/DLiLij4JiseNsBN0kK0jHt+oi4q+CYinQGcH32xIZngE91tHLdXiJqZmY9r96bg8zMrAc5CZiZlZiTgJlZiTkJmJmVmJOAmVmJOQmYmZWYk4CZWYk5CZiZldj/BzMCJxTf/3imAAAAAElFTkSuQmCC
",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x10c077048>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "plt.plot(params, test_scores)\n",
    "plt.title(\"max_depth vs CV Error\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Wow: at a depth of 5, the CV error drops to 0.127."
   ]
  },
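  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Instead of reading the best depth off the plot, we can also pick it programmatically. A small sketch, assuming `params` and `test_scores` from the cells above; `best_param` is a hypothetical helper written for this notebook, not part of any library."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def best_param(params, scores):\n",
    "    # Return the (parameter, score) pair with the lowest CV error\n",
    "    idx = int(np.argmin(scores))\n",
    "    return params[idx], scores[idx]\n",
    "\n",
    "best_depth, best_rmse = best_param(params, test_scores)\n",
    "print('best max_depth:', best_depth, 'CV RMSE:', round(best_rmse, 4))"
   ]
  },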
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "And that is why, in the restless competition scene, everybody is using XGBoost :)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
